<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=StevanEarl</id>
	<title>sokwedb - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=StevanEarl"/>
	<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/wiki/Special:Contributions/StevanEarl"/>
	<updated>2026-06-10T15:28:06Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.6</generator>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=617</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=617"/>
		<updated>2026-06-10T01:15:38Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Cross-reference problems #85 and #87&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
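&lt;br /&gt;
How the conversion discards the date portion is not described on this page.  As a sketch only, assuming the column can be cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; in the same way problem #3 casts &amp;lt;code&amp;gt;BREC_FOL_date&amp;lt;/code&amp;gt;, the time of day could be extracted like this (the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is made up for illustration):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and drop whatever date is attached.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;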
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
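&lt;br /&gt;
As a minimal sketch of the substitution described above (assuming the unknown person is stored under the code &amp;#039;UNK&amp;#039;; the conversion program&amp;#039;s actual code is not shown on this page), the NULL values could be mapped with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the (assumed) UNK person code wherever made_by is NULL.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;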
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the first arrival/last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
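&lt;br /&gt;
If the begin and end times of a follow are taken from FOLLOW_ARRIVAL rather than from FOLLOW, they could be derived along the lines of the queries in problems #24 and #25.  This is a sketch under that assumption, not the conversion&amp;#039;s actual code; the &amp;lt;code&amp;gt;derived_begin&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;derived_end&amp;lt;/code&amp;gt; names are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: derive each follow&amp;#039;s span from the focal&amp;#039;s own arrival rows.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin  -- hypothetical column name&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end      -- hypothetical column name&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;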
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
Applying the solution to both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
updates 297 follow_arrival rows.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
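&lt;br /&gt;
A minimal sketch of how the cleanups for problems #31 and #32 might be combined, assuming the cleanup is applied as an update to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema&amp;#039;s copy of the follow table.  The actual conversion code is not shown on this page; the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression is the one used in the queries for problem #33.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize animids by removing trailing spaces&lt;br /&gt;
-- and forcing upper-case.  Rows already normalized are skipped.&lt;br /&gt;
update clean.follow&lt;br /&gt;
  set fol_b_animid = upper(rtrim(fol_b_animid))&lt;br /&gt;
  where fol_b_animid &amp;lt;&amp;gt; upper(rtrim(fol_b_animid));&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;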
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  This amounts to about half that many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
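&lt;br /&gt;
A sketch of the row update, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the change is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  Both are assumptions; the conversion code itself is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give arrivals with no cycle information the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;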
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (brec swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
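&lt;br /&gt;
For reference, the change of cycle code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be expressed against the Postgres copy of the data roughly as follows.  This is an illustrative sketch only, assuming the cycle code is stored as text; the actual fix was made in MS Access.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-females (including unknown sex) with a cycle&lt;br /&gt;
-- code of 0 get a cycle code of n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;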
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
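&lt;br /&gt;
A minimal sketch of the recoding, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is already a valid cycle code and that the update is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The conversion&amp;#039;s actual code is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;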
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
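&lt;br /&gt;
As a sketch, and assuming the corrections are made with plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion itself may do this differently), the two case corrections might look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the two known variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;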
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
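&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done in place with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion may instead trim the values while loading):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;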
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- The focal id column is spelled &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; (capital ID) in this table.&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
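&lt;br /&gt;
As a sketch, assuming the change is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion may apply it at a different step):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is checked.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;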
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
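&lt;br /&gt;
One way to express this mapping is a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is a sketch, not the conversion code: the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is hypothetical, and trimming and upper-casing the value before comparing is an assumption. Any value not covered by the rule yields &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values remain visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag next to its proposed boolean value.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN BTRIM(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN UPPER(BTRIM(ae.ae_bad_observation_flag)) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- hypothetical column name&lt;br /&gt;
  FROM easy.aggression_event AS ae;&amp;lt;/pre&amp;gt;&lt;br /&gt;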
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
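&lt;br /&gt;
The proposed conversions above could be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is an illustration pending confirmation, not the conversion code: the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; column name is hypothetical, and trimming and upper-casing before comparing is an assumption that mirrors the queries above. Values not in the list, including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, yield &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag next to its proposed converted value.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN BTRIM(ae.ae_recipient_certainty_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag  -- hypothetical column name&lt;br /&gt;
  FROM clean.aggression_event AS ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;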
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (the required values) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
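&lt;br /&gt;
As a rough illustration of how the two proposed steps chain together, the following sketch previews both conversions for a single flag (ae_chase_flag, chosen arbitrarily). The trimmed, case-insensitive matching is an assumption, and NULL is passed through unmapped because neither list covers it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the clean-schema and sokwe conversions for one flag.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_chase_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_chase_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_chase_flag  -- anything else (including NULL) is left untouched&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    ELSE NULL  -- not covered by the proposed lists&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY 1, 2, 3&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;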
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id IS NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
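&lt;br /&gt;
To get the number of AGGRESSION_EVENT rows rather than pivoted participant records, each row can be counted once regardless of whether one or both ids fail to match. The following is a sketch under the same conditions as the queries below (NULL and empty ids excluded, whitespace significant); it has not been run against the data, so no count is given.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count each aggression_event row once, even if both ids are unmatched.&lt;br /&gt;
SELECT COUNT(*) AS affected_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1 FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_aggressor_id)&lt;br /&gt;
      )&lt;br /&gt;
   OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1 FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_recipient_id)&lt;br /&gt;
      );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;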
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips whitespace when comparing aggression participant ids with biography animid values: the focus here is on events outside the defined date ranges rather than on matching ids, and it is assumed that the whitespace issues will be resolved (this stripping is also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (this restriction is not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
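&lt;br /&gt;
Because concatenating NULL yields NULL in PostgreSQL, the query above shows NULL values as an empty result row rather than a quoted value. A variant along the following lines makes NULL, empty, and whitespace-only values easier to tell apart. It is a sketch only; value_length is most informative if the column is text or varchar, which is an assumption, since LENGTH ignores trailing padding on fixed-width character columns.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each distinct value with NULL and whitespace made visible.&lt;br /&gt;
SELECT&lt;br /&gt;
    QUOTE_NULLABLE(ae_fight_category) AS quoted_value,  -- NULL is shown as the word NULL&lt;br /&gt;
    LENGTH(ae_fight_category) AS value_length,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;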
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error but note that, in practice, this error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; AS fol_cl_community_id_untrimmed,&lt;br /&gt;
*&lt;br /&gt;
FROM easy.follow&lt;br /&gt;
WHERE RTRIM(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Notes ===&lt;br /&gt;
&lt;br /&gt;
But see problem #87. This problem is only relevant if codes.food_names.description is derived from fl_sci_food_name (i.e., not fl_sci_food_name_gen); else, refer to #87.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#86) There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. These issues are detailed below. Please note that this problem concerns the food_part_lookup table specifically.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`&lt;br /&gt;
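&lt;br /&gt;
The shared initials noted above can be listed with a query along these lines. This is only a sketch: the table name clean.food_part_lookup is an assumption (substitute the table or CTE that produced the summary), and it assumes NA is stored as the literal text &amp;#039;NA&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Initials that are shared by more than one local food part name.&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    COUNT(DISTINCT fpl_local_food_part) AS local_part_count,&lt;br /&gt;
    STRING_AGG(DISTINCT fpl_local_food_part, &amp;#039;, &amp;#039;) AS local_parts&lt;br /&gt;
FROM clean.food_part_lookup  -- assumed table name&lt;br /&gt;
WHERE fpl_food_part_initials &amp;lt;&amp;gt; &amp;#039;NA&amp;#039;  -- assumes NA is literal text&lt;br /&gt;
GROUP BY fpl_food_part_initials&lt;br /&gt;
HAVING COUNT(DISTINCT fpl_local_food_part) &amp;gt; 1&lt;br /&gt;
ORDER BY fpl_food_part_initials;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;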
&lt;br /&gt;
&lt;br /&gt;
== * (#87) There are numerous duplicate fl_sci_food_name_gen values in food_lookup. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 42 instances where fl_sci_food_name_gen values are associated with more than one fl_local_food_name value. This problem assumes that we are using the fl_sci_food_name_gen value for food_lookup.description (i.e., instead of fl_sci_food_name). If food_lookup.description should instead reflect fl_sci_food_name, then this particular problem is moot and can be ignored (but other problems will certainly arise with the switch to fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH ranked AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        fl_local_food_name,&lt;br /&gt;
        fl_sci_food_name,&lt;br /&gt;
        fl_sci_food_name_gen,&lt;br /&gt;
        COUNT(*) OVER (&lt;br /&gt;
            PARTITION BY BTRIM(fl_sci_food_name_gen)&lt;br /&gt;
        ) AS sci_gen_count&lt;br /&gt;
    FROM clean.food_lookup&lt;br /&gt;
    WHERE fl_sci_food_name_gen IS NOT NULL&lt;br /&gt;
      AND BTRIM(fl_sci_food_name_gen) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    fl_sci_food_name_gen,&lt;br /&gt;
    sci_gen_count&lt;br /&gt;
FROM ranked&lt;br /&gt;
WHERE sci_gen_count &amp;gt; 1&lt;br /&gt;
ORDER BY fl_sci_food_name_gen, fl_local_food_name, fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
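&lt;br /&gt;
To see one row per duplicated value rather than every affected row, a grouped variant of the query above may help. This is a minimal sketch that assumes the same clean.food_lookup columns; whether its row count matches the 42 instances noted above depends on how those instances were counted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per trimmed fl_sci_food_name_gen value that maps to&lt;br /&gt;
-- more than one distinct fl_local_food_name.&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(fl_sci_food_name_gen) AS sci_gen,&lt;br /&gt;
    COUNT(DISTINCT fl_local_food_name) AS local_name_count&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
-- NULL values fail this comparison, so they are excluded as well.&lt;br /&gt;
WHERE BTRIM(fl_sci_food_name_gen) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
GROUP BY BTRIM(fl_sci_food_name_gen)&lt;br /&gt;
HAVING COUNT(DISTINCT fl_local_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY sci_gen;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;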
&lt;br /&gt;
=== Notes ===&lt;br /&gt;
&lt;br /&gt;
But see problem #85. This problem is only relevant if codes.food_names.description is derived from fl_sci_food_name_gen (i.e., not fl_sci_food_name); else, refer to #85.&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=616</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=616"/>
		<updated>2026-06-09T23:34:07Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #87 - There are numerous duplicate fl_sci_food_name_gen values in food_lookup.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
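&lt;br /&gt;
The substitution can be sketched as follows. This is an illustration only, not the code from the commit above; it assumes UNK is the code given to the unknown person.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person&amp;#039;s code.&lt;br /&gt;
SELECT log.*&lt;br /&gt;
     , COALESCE(log.made_by, &amp;#039;UNK&amp;#039;) AS made_by_or_unk&lt;br /&gt;
  FROM clean.community_membership_update_log AS log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;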
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual end-time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
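&lt;br /&gt;
As a minimal sketch of this cleanup, combined with the trailing-space removal of problem #31, the query below lists each affected raw animid next to its normalized form.  It assumes the normalization is &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt;, the expression used in the problem #33 query.  The &amp;lt;code&amp;gt;raw_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;follows&amp;lt;/code&amp;gt; column aliases are made up for illustration, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the animid normalization, without changing data.&lt;br /&gt;
-- Assumes upper(rtrim(...)) is the cleanup, as in the problem #33 query.&lt;br /&gt;
select fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
     , count(*) as follows&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  group by fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;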
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
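&lt;br /&gt;
No query is given for this problem.  A sketch of one, modeled on the problem #37 query, follows.  It reports follows where, in either the AM or the PM period, exactly one of the two observer columns has a value.  Treating NULL and the empty string (after trimming) as having no value follows the description in problem #36; whether this matches the conversion code exactly is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows with exactly one observer in the AM or PM period.&lt;br /&gt;
-- Each comparison yields a boolean, true when the column has no value;&lt;br /&gt;
-- the two booleans differ when exactly one observer is present.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;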
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
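&lt;br /&gt;
A sketch of how this substitution could be previewed, assuming the new value is spelled &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; and is applied with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_for_conversion&amp;lt;/code&amp;gt; alias is hypothetical, and the conversion process may apply the value differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the value the NULL cycle rows would receive.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_for_conversion&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;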
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
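&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; restricted to the rows found by the query above. The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a as MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;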
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
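&lt;br /&gt;
The two spelling corrections above could be sketched as follows, assuming they are made with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case of the misspelled codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;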
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
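&lt;br /&gt;
No query is listed for the start-time-only check described in the Problem section. A sketch, assuming it is the start-and-end-time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out of the grouping and the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;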
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals less than 8 years of age, that have a sexual cycle code indicating sexual swelling.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
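&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;. The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;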
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
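&lt;br /&gt;
Whether the duplicates can simply be deleted depends on whether the rows sharing a key are identical in every column. A sketch of one way to check, assuming every column type in the table supports equality comparison: collapse exact duplicate rows first, then list the keys that are still duplicated. Keys returned here have rows that differ in some non-key column, so deleting one of them would lose information.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: keys still duplicated after identical rows are collapsed.&lt;br /&gt;
SELECT &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
     , count(*) AS distinct_rows&lt;br /&gt;
  FROM (SELECT DISTINCT *&lt;br /&gt;
          FROM raw.&amp;quot;GROOM_BOUT&amp;quot;) AS distinct_bouts&lt;br /&gt;
  GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
  HAVING count(*) &amp;gt; 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;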
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
The focal column name ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt;. With the latter spelling the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
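&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace zero animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;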
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
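&lt;br /&gt;
The mapping can be previewed before it is applied. This is a sketch, assuming the comparison is made after trimming whitespace and folding case (so &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt; would also map to &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;); the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is illustrative. Any value not covered by the rules shows up with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN upper(btrim(ae.ae_bad_observation_flag))&lt;br /&gt;
              IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;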
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
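&lt;br /&gt;
A minimal sketch of how the sokwe conversions listed above might be expressed, pending confirmation. The sample values and the sokwe_flag column name are hypothetical; applying UPPER and BTRIM mirrors the detection queries above, and leaving unlisted values (such as NULL) unmapped is an assumption rather than a recorded decision.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values standing in for ae_recipient_certainty_flag&lt;br /&gt;
WITH sample(raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;?&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- blank, N and NO map to 0&lt;br /&gt;
    WHEN UPPER(BTRIM(raw_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    -- X and anything starting with Y map to 1&lt;br /&gt;
    WHEN UPPER(BTRIM(raw_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- assumption: anything else stays unmapped (NULL) for review&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS sokwe_flag&lt;br /&gt;
FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;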
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
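&lt;br /&gt;
The two stages listed above could be chained as in the following sketch, pending confirmation. The sample values and the clean_flag and sokwe_flag column names are hypothetical, and passing NULL through both stages unchanged is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values standing in for any one of the event flags&lt;br /&gt;
WITH sample(raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039;?&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;x &amp;#039;), (NULL)&lt;br /&gt;
),&lt;br /&gt;
cleaned AS (&lt;br /&gt;
  -- Stage 1: clean schema conversions&lt;br /&gt;
  SELECT&lt;br /&gt;
    raw_flag,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(raw_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE raw_flag  -- assumption: other values, including NULL, pass through&lt;br /&gt;
    END AS clean_flag&lt;br /&gt;
  FROM sample&lt;br /&gt;
)&lt;br /&gt;
-- Stage 2: sokwe conversions&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_flag,&lt;br /&gt;
  clean_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN BTRIM(clean_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    -- no ELSE branch: NULL and unlisted values yield NULL&lt;br /&gt;
  END AS sokwe_flag&lt;br /&gt;
FROM cleaned;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;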
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of non-compliant AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
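&lt;br /&gt;
To count each AGGRESSION_EVENT row once instead of once per pivoted role, the lateral pivot can be replaced with an OR across the two columns. This is only a sketch: it applies the same exclusions as the queries below (NULL and empty-string participants are skipped) and uses only the table and column names that appear in those queries.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS event_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    -- aggressor present but not in biography&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    -- recipient present but not in biography&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;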
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips white space when comparing aggression participant ids with biography animids: the focus here is on events outside the defined date ranges rather than on matching ids, and it is assumed that white space issues will be resolved (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT DISTINCT &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; FROM clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error but note that, in practice, this error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; AS fol_cl_community_id_untrimmed,&lt;br /&gt;
*&lt;br /&gt;
FROM easy.follow&lt;br /&gt;
WHERE RTRIM(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
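&lt;br /&gt;
The query above groups by fl_sci_food_name. If the intent is to list each local food name that maps to more than one scientific name, as the problem statement reads, a complementary check grouped by fl_local_food_name might look like the following sketch. It is an assumption about intent, not a replacement for the query above; sci_name_count is a hypothetical column name.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  -- number of distinct scientific names recorded for this local name&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;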
&lt;br /&gt;
== * (#86) There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. These issues are detailed below. Please note that this problem concerns the food_part_lookup table specifically.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`?&lt;br /&gt;
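&lt;br /&gt;
The overlaps listed above could be surfaced with a query along the following lines. This is only a sketch: it assumes the de-duplicated rows shown in the summary have been saved to a hypothetical table (or view) named &amp;lt;code&amp;gt;clean.food_part_pairs&amp;lt;/code&amp;gt;, which is not part of the database as described on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical relation: clean.food_part_pairs holds the summary rows.&lt;br /&gt;
-- Lists each initial that is shared by more than one local food part name.&lt;br /&gt;
SELECT fpl_food_part_initials&lt;br /&gt;
     , COUNT(DISTINCT fpl_local_food_part) AS local_name_count&lt;br /&gt;
     , STRING_AGG(DISTINCT fpl_local_food_part, &amp;#039;, &amp;#039;) AS local_names&lt;br /&gt;
  FROM clean.food_part_pairs&lt;br /&gt;
  WHERE fpl_food_part_initials &amp;lt;&amp;gt; &amp;#039;NA&amp;#039;&lt;br /&gt;
  GROUP BY fpl_food_part_initials&lt;br /&gt;
  HAVING COUNT(DISTINCT fpl_local_food_part) &amp;gt; 1&lt;br /&gt;
  ORDER BY fpl_food_part_initials;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Grouping by fpl_english_food_part instead would, in the same way, list the English translations shared by several local names.&lt;br /&gt;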
&lt;br /&gt;
&lt;br /&gt;
== * (#87) There are numerous duplicate fl_sci_food_name_gen values in food_lookup. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 42 instances where a fl_sci_food_name_gen value is associated with more than one fl_local_food_name value. This problem assumes that we are using the fl_sci_food_name_gen value for food_lookup.description (i.e., instead of fl_sci_food_name). If, instead, food_lookup.description should reflect fl_sci_food_name, then this particular problem is moot and can be ignored (but other problems will certainly arise with the switch to fl_sci_food_name).&lt;br /&gt;
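&lt;br /&gt;
A count of this kind could be checked with a query like the one below. It is a sketch only, using the column names that appear in the Bad Data query; it groups on the trimmed fl_sci_food_name_gen value and counts distinct fl_local_food_name values, which is an assumption about how the instances are meant to be counted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per fl_sci_food_name_gen value that maps to several local names.&lt;br /&gt;
SELECT BTRIM(fl_sci_food_name_gen) AS sci_gen&lt;br /&gt;
     , COUNT(DISTINCT fl_local_food_name) AS local_name_count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  WHERE fl_sci_food_name_gen IS NOT NULL&lt;br /&gt;
        AND BTRIM(fl_sci_food_name_gen) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  GROUP BY BTRIM(fl_sci_food_name_gen)&lt;br /&gt;
  HAVING COUNT(DISTINCT fl_local_food_name) &amp;gt; 1&lt;br /&gt;
  ORDER BY sci_gen;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;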
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH ranked AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        fl_local_food_name,&lt;br /&gt;
        fl_sci_food_name,&lt;br /&gt;
        fl_sci_food_name_gen,&lt;br /&gt;
        COUNT(*) OVER (&lt;br /&gt;
            PARTITION BY BTRIM(fl_sci_food_name_gen)&lt;br /&gt;
        ) AS sci_gen_count&lt;br /&gt;
    FROM clean.food_lookup&lt;br /&gt;
    WHERE fl_sci_food_name_gen IS NOT NULL&lt;br /&gt;
      AND BTRIM(fl_sci_food_name_gen) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    fl_sci_food_name_gen,&lt;br /&gt;
    sci_gen_count&lt;br /&gt;
FROM ranked&lt;br /&gt;
WHERE sci_gen_count &amp;gt; 1&lt;br /&gt;
ORDER BY fl_sci_food_name_gen, fl_local_food_name, fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=615</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=615"/>
		<updated>2026-06-03T00:29:53Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: minor edit to Problem #84 language&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
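&lt;br /&gt;
One way such values could be normalized is to keep only the time-of-day portion of the column. The following is a sketch under that assumption, not the conversion&amp;#039;s actual code, which is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: discarding the date portion is the desired fix.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;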
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
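&lt;br /&gt;
The substitution described above could be sketched as follows. The column names come from the queries on this page; the literal &amp;#039;UNK&amp;#039; code and the use of COALESCE are assumptions about how the conversion might do this, not a description of the referenced commit.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report the unknown person wherever made_by is missing.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;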
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
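&lt;br /&gt;
The cleanup for this problem and problem #31 might be sketched as follows.  This is an illustration only, not the actual conversion code; it reuses the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression from the problem #33 queries, and the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each distinct raw animid that the cleanup changes.&lt;br /&gt;
select distinct fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;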
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
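&lt;br /&gt;
The follows affected could be listed with a query along these lines.  This is a sketch only, assuming that &amp;quot;no observer&amp;quot; means the same NULL-or-blank test used in the problem #37 query; it flags a follow when either the AM or the PM period has exactly one observer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: a period has exactly one observer when one column is&lt;br /&gt;
-- blank and the other is not, so the two blank tests differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;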
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
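&lt;br /&gt;
The substitution might look like the following.  This is a sketch only, not the actual conversion code; it assumes fa_type_of_cycle is a text column (it is compared to text codes elsewhere on this page), and the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show MISS in place of a NULL cycle code.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;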
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: FIXED MANY ROWS IN MS ACCESS.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access, ICG 5/19/2026.&lt;br /&gt;
&lt;br /&gt;
Confirm from brec swahili that CT and SG were with CA on 10/23/1980.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
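&lt;br /&gt;
The change could be made with an update along the following lines.  This is only a sketch: it assumes the fix is a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, restricted to the same rows the query above selects.  The actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the cleanup is a direct update of clean.follow_arrival.&lt;br /&gt;
-- Recode n/a to MISS for arriving individuals recorded as female.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;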
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
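&lt;br /&gt;
The two case corrections listed at the top of this solution could be written as below.  This is a sketch that assumes the corrections are applied as direct updates to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; adding the new codes to the list of valid sources is a separate step that is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the cleanup is a direct update of clean.follow_arrival.&lt;br /&gt;
-- Normalize the case variants of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the case variant of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;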
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
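&lt;br /&gt;
The 430-row figure in the problem description comes from leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the comparison.  That query is not recorded here; the following is a sketch that assumes it has the same shape as the start-and-end-time query.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; assumed to mirror the query above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;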
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
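&lt;br /&gt;
A minimal sketch of the removal follows, assuming it is done as a direct update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The actual conversion code may trim the values at a different step.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the cleanup is a direct update of clean.follow_arrival.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;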
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The focal column is named &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; (upper-case D). Spelling it &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; in the query fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
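&lt;br /&gt;
As a sketch, assuming the change is made as a direct update of &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only; assumes the cleanup is a direct update of clean.biography.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;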
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
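&lt;br /&gt;
The mapping can be previewed with a query such as the one below.  This is a sketch, not the conversion code: the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for illustration, and treating any run of spaces as a space is an assumption.  Values the mapping does not cover come out as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so they stand out in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only; bad_observation is a hypothetical output column.&lt;br /&gt;
SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;) AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;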
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than `Y` or `N` as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
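&lt;br /&gt;
As a sketch of how the proposed mapping might be applied (pending the confirmation requested above), the query below previews the converted value for every raw flag. It assumes that NULL is treated the same as a blank, that matching is case-insensitive after trimming, and that any unmapped value is left NULL for review; &amp;#039;&amp;#039;sokwe_flag&amp;#039;&amp;#039; is a hypothetical column name.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
    CASE&lt;br /&gt;
        -- assumption: NULL and blank both convert to &amp;#039;0&amp;#039;&lt;br /&gt;
        WHEN UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
        WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
        WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
        -- no ELSE: values outside the mapping stay NULL so they can be reviewed&lt;br /&gt;
    END AS sokwe_flag,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;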
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
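&lt;br /&gt;
The two conversion steps listed above could be chained as in the following sketch, shown for a single flag (ae_decided_flag) and pending the confirmation requested above. It assumes case-insensitive matching after trimming and that NULL converts to &amp;#039;0&amp;#039; like a blank; &amp;#039;&amp;#039;clean_value&amp;#039;&amp;#039; and &amp;#039;&amp;#039;sokwe_value&amp;#039;&amp;#039; are hypothetical column names.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_decided_flag AS raw_value,&lt;br /&gt;
    -- step 1: clean schema conversion&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_decided_flag&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  -- step 2: sokwe conversion (assumption: NULL is treated like a blank)&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value IS NULL OR BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- no ELSE: values outside the mapping stay NULL so they can be reviewed&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY 1, 2, 3&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;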
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of problematic AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
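&lt;br /&gt;
To obtain the number of AGGRESSION_EVENT rows affected, rather than the pivoted Actor/Actee count, the same conditions can be applied without the pivot. The following is a sketch only; it keeps the exclusion of NULL and empty ids and the whitespace-sensitive comparison described in the notes above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        -- Actor present but not in biography&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
        )&lt;br /&gt;
      )&lt;br /&gt;
   OR (&lt;br /&gt;
        -- Actee present but not in biography&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
        )&lt;br /&gt;
      );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;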
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips white space when comparing aggression participant ids with biography animids: the focus here is on events outside the defined date ranges rather than on matching ids, and it is assumed that the white space issues will be resolved (white space is also stripped in the migration work-around). The query further restricts the aggression events to those with a valid follow record (a restriction that is not part of the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
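&lt;br /&gt;
The full-record and summary queries above filter on &amp;lt;code&amp;gt;ae_fight_category IS NOT NULL&amp;lt;/code&amp;gt;, so NULL values are not part of their output. As a sketch, assuming NULL should be tallied alongside the white space variants, the following breaks the non-compliant rows down by kind; &amp;#039;&amp;#039;issue&amp;#039;&amp;#039; is a hypothetical column name.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_fight_category IS NULL THEN &amp;#039;null&amp;#039;&lt;br /&gt;
      WHEN BTRIM(ae_fight_category) = &amp;#039;&amp;#039; THEN &amp;#039;white_space_only&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;other_value&amp;#039;&lt;br /&gt;
    END AS issue,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NULL&lt;br /&gt;
   OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
GROUP BY 1&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;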
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error but note that, in practice, this error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; AS fol_cl_community_id_untrimmed,&lt;br /&gt;
*&lt;br /&gt;
FROM easy.follow&lt;br /&gt;
WHERE RTRIM(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
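&lt;br /&gt;
The query above groups on fl_sci_food_name, so it lists scientific names that occur on more than one row. If the direction stated in the Problem section is the one of interest (one local name, several scientific names), a sketch along the following lines could be used. It assumes both columns are text; &amp;#039;&amp;#039;sci_name_count&amp;#039;&amp;#039; and &amp;#039;&amp;#039;sci_names&amp;#039;&amp;#039; are hypothetical column names.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
    -- list the distinct scientific names attached to this local name&lt;br /&gt;
    STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;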
&lt;br /&gt;
== * (#86) There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. These issues are detailed below. Please note that this problem concerns the food_part_lookup table specifically.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=614</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=614"/>
		<updated>2026-06-03T00:29:20Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: update query for Problem #84 bad data&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
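&lt;br /&gt;
As a hedged sketch only (the conversion program&amp;#039;s actual code is not shown on this page), a cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; is one way to keep the time of day and discard the stray date portion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: assumes BREC_time holds a timestamp-like value.&lt;br /&gt;
-- brec_time_only is a hypothetical output column name.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;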
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
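&lt;br /&gt;
A minimal sketch of the substitution, assuming &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is textual and that the unknown person&amp;#039;s code is &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the conversion program&amp;#039;s actual code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- made_by_or_unk is a hypothetical output column name.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;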
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
Applying the solution to both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
updates 297 follow_arrival rows.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
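&lt;br /&gt;
As a minimal sketch, assuming the cleanups for problems #31 and #32 are applied together, the following lists each affected animid beside its cleaned form.  The &amp;lt;code&amp;gt;old_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_animid&amp;lt;/code&amp;gt; column aliases are illustrative only; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, not the conversion code itself.&lt;br /&gt;
-- Shows every animid changed by trimming trailing spaces&lt;br /&gt;
-- and forcing upper-case.&lt;br /&gt;
select fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;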
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
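&lt;br /&gt;
The mapping described above can be sketched as a query.  This is an illustration under assumptions, not the conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the lower-case output column names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed,&lt;br /&gt;
-- not taken from the PERIODS table.&lt;br /&gt;
-- One row per follow and time period, skipping periods where&lt;br /&gt;
-- both observer columns are NULL or blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;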
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
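&lt;br /&gt;
No query is recorded for this problem.  A minimal sketch, assuming &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period is NULL or blank, might be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  A period has a single observer when exactly&lt;br /&gt;
-- one of its two observer columns is NULL or blank, that is,&lt;br /&gt;
-- when the two &amp;quot;is blank&amp;quot; tests disagree.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;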
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
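&lt;br /&gt;
One way to pick the mixed case code is sketched below.  The sample names are made up, and the tie-break used when no mixed case variant exists is an assumption; the solution above does not say what to do in that case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, using hypothetical sample names.&lt;br /&gt;
-- For each set of names that differ only by case, prefer the&lt;br /&gt;
-- variant that is neither all upper-case nor all lower-case.&lt;br /&gt;
-- For the sample &amp;#039;js&amp;#039; set this chooses &amp;#039;Js&amp;#039;.  The &amp;#039;mk&amp;#039; set&lt;br /&gt;
-- has no mixed case variant, so its choice is arbitrary.&lt;br /&gt;
with uniq_people (person) as (&lt;br /&gt;
  values (&amp;#039;JS&amp;#039;), (&amp;#039;Js&amp;#039;), (&amp;#039;js&amp;#039;), (&amp;#039;MK&amp;#039;), (&amp;#039;mk&amp;#039;))&lt;br /&gt;
select distinct on (lower(person))&lt;br /&gt;
       lower(person) as folded&lt;br /&gt;
     , person as chosen&lt;br /&gt;
  from uniq_people&lt;br /&gt;
  order by lower(person)&lt;br /&gt;
         , (person &amp;lt;&amp;gt; upper(person)&lt;br /&gt;
            and person &amp;lt;&amp;gt; lower(person)) desc&lt;br /&gt;
         , person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;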
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
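&lt;br /&gt;
A minimal sketch of the substitution, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, not the conversion code itself.&lt;br /&gt;
-- Shows the arrivals that would receive the MISS code.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;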
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
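&lt;br /&gt;
The endpoint arithmetic can be checked with a worked example.  The follow date and birthdates below are made up for illustration; the interval arithmetic is the same as in the query for this problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example with a hypothetical follow date of 2000-06-15.&lt;br /&gt;
-- The cutoff is 1985-06-15.  A female born on the cutoff date&lt;br /&gt;
-- is exactly 15 on the day of the follow and is flagged; one&lt;br /&gt;
-- born a day later is still 14 and is not flagged.&lt;br /&gt;
select cutoff::date as cutoff&lt;br /&gt;
     , DATE &amp;#039;1985-06-15&amp;#039; &amp;lt;= cutoff as born_on_cutoff_flagged&lt;br /&gt;
     , DATE &amp;#039;1985-06-16&amp;#039; &amp;lt;= cutoff as born_day_after_flagged&lt;br /&gt;
  from (select DATE &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
               - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
               - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff) as example;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;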
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                  - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                  - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
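&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; that reuses the conditions of the query above.  The statement is illustrative; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch; not taken from the conversion scripts.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;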
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
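&lt;br /&gt;
The two case corrections could be sketched as follows, assuming they are made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The statement is illustrative only, and it does not cover adding the new codes to the list of allowed values.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch; not taken from the conversion scripts.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;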
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
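&lt;br /&gt;
The Problem section also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch of that check, assuming it mirrors the start-and-end-time query with the end-time column removed; whether it reproduces the reported row count has not been verified here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left out)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;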
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE (query run with the column name spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
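&lt;br /&gt;
A minimal sketch of the change, assuming it is made with a single statement in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (illustrative only; the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch; not taken from the conversion scripts.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;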
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
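&lt;br /&gt;
The mapping could be previewed with a query along these lines.  This is a sketch that assumes the flag is a text column; the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, and values not named in this solution are deliberately left unmapped.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch; not taken from the conversion scripts.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         -- btrim() reduces a value made only of spaces to the empty string,&lt;br /&gt;
         -- so this also treats an empty string as FALSE.&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         ELSE NULL  -- any other value is left unmapped, for review&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The same pattern, with the column name and the list of accepted values changed, would apply to the other AGGRESSION_EVENT flag columns.&lt;br /&gt;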
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
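&lt;br /&gt;
Pending that confirmation, the proposed mapping can be previewed against the raw values. The query below is a minimal sketch, not the conversion code. It assumes that comparisons should ignore case and surrounding white space (so an empty string is treated like a single space), and it leaves NULL and any value not in the list above as NULL, because the list does not say how to treat them.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only: raw flag, proposed sokwe value, and row count.&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
    CASE&lt;br /&gt;
      -- Assumption: trim and upper-case before comparing.&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
      -- NULL and unlisted values fall through and stay NULL.&lt;br /&gt;
    END AS proposed_flag,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows where proposed_flag is NULL are the values that still need a decision.&lt;br /&gt;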
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
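&lt;br /&gt;
Pending that confirmation, the two-step mapping can be previewed for a single flag. The query below is a minimal sketch for ae_decided_flag only, not the conversion code. It assumes that pattern matches should ignore case and surrounding white space, and it leaves NULL unmapped in the sokwe step because the list above does not say how NULL is converted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae_decided_flag AS raw_value,&lt;br /&gt;
      -- Step 1: clean schema conversions.&lt;br /&gt;
      CASE&lt;br /&gt;
        WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
        WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
        WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
        ELSE ae_decided_flag  -- anything else passes through unchanged&lt;br /&gt;
      END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    raw_value,&lt;br /&gt;
    clean_value,&lt;br /&gt;
    -- Step 2: sokwe conversions; unlisted values (including NULL) stay NULL.&lt;br /&gt;
    CASE clean_value&lt;br /&gt;
      WHEN &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
      WHEN &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    END AS sokwe_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The same pattern applies to the other eight flag columns by substituting the column name.&lt;br /&gt;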
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
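&lt;br /&gt;
To count offending AGGRESSION_EVENT rows rather than pivoted participant records, each row can be tested once for either role. The query below is a minimal sketch that uses the same exact-match (untrimmed) comparison and the same exclusion of NULL and empty ids as the queries in the Bad data section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One count per AGGRESSION_EVENT row, even when both roles fail to match.&lt;br /&gt;
SELECT COUNT(*) AS offending_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
        )&lt;br /&gt;
      )&lt;br /&gt;
   OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
        )&lt;br /&gt;
      );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;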
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside that animal&amp;#039;s valid study participation window in the biography data. Note that the query trims white space when comparing aggression participant ids with biography animal ids: the focus here is on events outside the defined date ranges rather than on matching ids, and the white space issues are assumed to be resolved separately (trimming is also done in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (this restriction is not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error but note that, in practice, this error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; AS fol_cl_community_id_untrimmed,&lt;br /&gt;
*&lt;br /&gt;
FROM easy.follow&lt;br /&gt;
WHERE RTRIM(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
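&lt;br /&gt;
The query above groups by fl_sci_food_name, so it lists scientific names that occur on more than one row. If the intent is the direction described in the Problem text (one local name with several scientific names), a sketch along the following lines could be used instead. The output column names sci_name_count and sci_names are illustrative, and the result has not been checked against the count of 47 given above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Local food names that map to more than one distinct scientific name.&lt;br /&gt;
SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
    STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;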
&lt;br /&gt;
== * (#86) There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. These issues are detailed below. Please note that this problem concerns the food_part_lookup table specifically.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=613</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=613"/>
		<updated>2026-06-02T23:28:00Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Clarify that Problem #86 is regarding food_part_lookup&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
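&lt;br /&gt;
A minimal sketch of one way the conversion could normalize these values, assuming the fix is simply to keep the time-of-day portion of &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; and discard its date portion. The actual conversion code is not shown on this page and may differ; the column aliases are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; the cast-based approach is an assumption.&lt;br /&gt;
-- For each affected row, show the stored value next to its time-of-day portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot; as original_value&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;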
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds: KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
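&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person is stored under the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each log row with a NULL made_by replaced by &amp;#039;UNK&amp;#039;.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;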
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When the solution is applied to both problems #26 and #27,&lt;br /&gt;
and follow_arrival is updated so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
297 follow_arrival rows are updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These follows involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These follows involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
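&lt;br /&gt;
If such a query is wanted, the following is a minimal sketch and not part of the conversion.  It assumes that noon separates the AM and PM periods, that the focal&amp;#039;s own follow_arrival rows give the follow&amp;#039;s time span (as in the queries for problems #25 and #26), and that fa_time_start and fa_time_end can be cast to &amp;lt;code&amp;gt;time&amp;lt;/code&amp;gt;.  None of these assumptions has been checked against the old database.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: observers recorded for a period that the follow does not overlap.&lt;br /&gt;
-- Assumption: noon is the AM/PM boundary.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , spans.min_start, spans.max_end&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE ((COALESCE(BTRIM(follow.fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
          OR COALESCE(BTRIM(follow.fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;)&lt;br /&gt;
         AND spans.min_start::time &amp;gt;= TIME &amp;#039;12:00&amp;#039;)&lt;br /&gt;
        OR ((COALESCE(BTRIM(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             OR COALESCE(BTRIM(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;)&lt;br /&gt;
            AND spans.max_end::time &amp;lt; TIME &amp;#039;12:00&amp;#039;)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;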
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
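&lt;br /&gt;
No query is recorded for this problem.  As a sketch, and assuming that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns for a time period is populated, the affected follows might be listed with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where the AM or the PM period has exactly one observer.&lt;br /&gt;
-- Each parenthesized test is true when that observer column is empty.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;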
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
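&lt;br /&gt;
To illustrate the endpoint arithmetic, here is a worked example.  The follow date of 2000-06-15 is made up for illustration and is not taken from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Latest birthdate that is flagged for a follow on the example date.&lt;br /&gt;
select (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
        - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
        - &amp;#039;14 years&amp;#039;::INTERVAL)::date as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This yields 1985-06-15.  So on that follow date, only females born on or before 1985-06-15, those who have reached their 15th birthday, are flagged.&lt;br /&gt;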
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
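&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with a single statement.  How the conversion scripts actually make the change is not shown on this page.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
-- Uses the same join and conditions as the Bad data query above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;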
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
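&lt;br /&gt;
The two case corrections above could be sketched as a single statement.  This assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; it does not cover adding the other codes to the list of valid values.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the misspelled codes.&lt;br /&gt;
-- The where clause limits the update to the three spellings listed above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case&lt;br /&gt;
                         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;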
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
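&lt;br /&gt;
The Problem section also reports a count for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch of that check follows, assuming it mirrors the start and end time query with only the grouping columns changed; the exact query used to obtain the count is not recorded here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;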
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
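&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with one update against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The condition is the same one used in the Bad data query, so only the affected rows are rewritten.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;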
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE (from running the query with the column spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
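&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a single update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined or changed.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;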
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
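&lt;br /&gt;
The mapping can be previewed before it is applied.  The query below is a sketch, assuming the flag is a text column; the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for illustration.  Any value not covered by the rule comes out as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the second column, so unexpected values stand out.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- hypothetical column name&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;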
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
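A minimal sketch of the proposed mapping as a `CASE` expression, pending the confirmation requested above. It assumes that values are trimmed and compared case-insensitively (as in the queries above), that an empty string is treated like a single space, and that NULL and any unlisted value stay unmapped so that they remain visible. The `proposed_flag` alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- assumption: an empty string is handled like a single space&lt;br /&gt;
    WHEN BTRIM(ae_recipient_certainty_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- assumption: NULL and unlisted values are left unmapped&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS proposed_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;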
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
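&lt;br /&gt;
A minimal sketch of how the two proposed steps could be chained, pending the confirmation requested above. The `samples` values are hypothetical inputs rather than values taken from the data, and the sketch assumes trimmed, case-insensitive matching and leaves NULL unmapped because the lists above do not cover it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH samples(raw_value) AS (&lt;br /&gt;
  -- hypothetical raw flag values, for illustration only&lt;br /&gt;
  VALUES (&amp;#039;?&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;Yes&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;x?&amp;#039;), (&amp;#039; &amp;#039;), (NULL)&lt;br /&gt;
),&lt;br /&gt;
cleaned AS (&lt;br /&gt;
  -- step 1: clean schema conversions&lt;br /&gt;
  SELECT&lt;br /&gt;
    raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(raw_value) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_value)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_value)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE raw_value&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM samples&lt;br /&gt;
)&lt;br /&gt;
-- step 2: sokwe conversions&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    -- assumption: NULL and anything else stay unmapped&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS sokwe_value&lt;br /&gt;
FROM cleaned;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;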
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields one pivoted record per role (ae_b_aggressor_id and ae_b_recipient_id), so it counts combined Actor/Actee records rather than the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
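&lt;br /&gt;
A minimal sketch of a count that is not inflated by the pivot: it tests both roles on the same row, so each AGGRESSION_EVENT row is counted at most once. It keeps the same assumptions as the queries below (NULL and empty IDs are skipped, and whitespace is significant); the `offending_rows` alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS offending_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    -- Actor is present but has no exact match in biography&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    -- Actee is present but has no exact match in biography&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;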
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside that animal&amp;#039;s valid study participation window in the biography data. Note that the query trims whitespace from the aggression participant IDs before comparing them to the biography animid values: the focus here is on checking for events outside of the defined date ranges rather than on matching IDs, and the query assumes that the whitespace issues will be resolved (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
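&lt;br /&gt;
In PostgreSQL, concatenating NULL with `||` yields NULL, so NULL categories show up in the query above as a single blank row. A minimal sketch that labels NULL explicitly and adds the character length and a row count per distinct value; the column aliases are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    -- quote the value so leading and trailing spaces are visible&lt;br /&gt;
    COALESCE(&amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039;, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS quoted_value,&lt;br /&gt;
    LENGTH(ae_fight_category) AS char_count,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC, quoted_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;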
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
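&lt;br /&gt;
The query above groups on fl_sci_food_name, so it lists scientific names that occur on more than one row. A minimal sketch of the check in the direction the problem statement describes (one local name, several scientific names) follows; which direction is intended is left for confirmation, and the `sci_name_count` and `sci_names` aliases are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
  -- list the distinct scientific names attached to this local name&lt;br /&gt;
  STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;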
&lt;br /&gt;
== * (#86) There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances in the food_part_lookup that conflate name, initial, and/or the English translation. These issues are detailed below. Please note that this problem concerns the food_part_lookup table specifically.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares both an initial (`D`) and an English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` is associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`?&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=612</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=612"/>
		<updated>2026-06-02T23:20:45Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: markdown list format to mediawiki list format&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
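&lt;br /&gt;
As a minimal sketch of what &amp;quot;default to follow_arrival&amp;quot; could look like for problems #26 and #27, the query below derives both in-nest flags from the focal&amp;#039;s own first and last arrival rows.  It uses the same nesting codes as the bad-data queries (1 or 3 for starting in the nest, 2 or 3 for ending in the nest).  The output column names and the restriction of the joined rows to the focal&amp;#039;s own arrivals are assumptions made for illustration; this is not the conversion&amp;#039;s actual code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
       -- 1 when the focal&amp;#039;s first arrival row says it started in the nest&lt;br /&gt;
     , MAX(CASE WHEN fa.fa_time_start = spans.min_start&lt;br /&gt;
                     AND fa.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
                THEN 1 ELSE 0 END) AS begin_in_nest  -- hypothetical name&lt;br /&gt;
       -- 1 when the focal&amp;#039;s last arrival row says it ended in the nest&lt;br /&gt;
     , MAX(CASE WHEN fa.fa_time_end = spans.max_end&lt;br /&gt;
                     AND fa.fa_type_of_nesting IN (2, 3)&lt;br /&gt;
                THEN 1 ELSE 0 END) AS end_in_nest    -- hypothetical name&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS fa&lt;br /&gt;
      ON (fa.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND fa.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND fa.fa_b_arr_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  GROUP BY spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;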
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
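&lt;br /&gt;
A small sketch of the combined cleanup for problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  It only previews which values would change; the &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column names are made up for illustration, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview the animids changed by trimming trailing spaces and upper-casing&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;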
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into FOLLOW_OBSERVERS.OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
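&lt;br /&gt;
As written, the query above returns a value containing &amp;quot;/&amp;quot; only when some other observer value differs from it solely by character case, because of the self-join on the lower-cased name.  A simpler sketch, assuming the goal is to list every distinct observer value that contains a &amp;quot;/&amp;quot; (which includes &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;), might be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- UNION removes duplicates across the four observer columns&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
-- NULL observers drop out here, because STRPOS(NULL, ...) is NULL&lt;br /&gt;
SELECT person&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE STRPOS(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;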
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants, and excluding the resulting duplicates, is complicated to do in the conversion process.  So resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
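&lt;br /&gt;
One way the choice of spelling could be sketched in PostgreSQL is shown below.  It assumes that &amp;quot;mixed case&amp;quot; means a value that is neither all upper-case nor all lower-case, and it picks one preferred spelling per case-folded name; the &amp;lt;code&amp;gt;folded&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;preferred&amp;lt;/code&amp;gt; column names are hypothetical.  The linked notes, not this sketch, describe what was actually done.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM easy.follow&lt;br /&gt;
)&lt;br /&gt;
-- Keep one row per case-folded name; mixed-case spellings sort first&lt;br /&gt;
SELECT DISTINCT ON (LOWER(person))&lt;br /&gt;
       LOWER(person) AS folded&lt;br /&gt;
     , person AS preferred&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY LOWER(person)&lt;br /&gt;
         , (person &amp;lt;&amp;gt; UPPER(person) AND person &amp;lt;&amp;gt; LOWER(person)) DESC&lt;br /&gt;
         , person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;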
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
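&lt;br /&gt;
A minimal sketch of the substitution, assuming &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; is a text column (it is compared with &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; elsewhere on this page).  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; output name is made up for illustration, and the conversion itself may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Report MISS wherever the arrival has no cycle information&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;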
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
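&lt;br /&gt;
To make the boundary concrete, the stand-alone query below applies the same comparison to two made-up birthdates and a made-up follow date of 2000-06-15.  It touches no tables; the dates and the &amp;lt;code&amp;gt;flagged&amp;lt;/code&amp;gt; column name exist only for this illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The cutoff is the follow date minus 15 years, i.e. 1985-06-15&lt;br /&gt;
select t.b_birthdate&lt;br /&gt;
     , t.b_birthdate &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
                          - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                          - &amp;#039;14 years&amp;#039;::INTERVAL) as flagged&lt;br /&gt;
  from (values (date &amp;#039;1985-06-15&amp;#039;)   -- exactly 15 years old: flagged&lt;br /&gt;
             , (date &amp;#039;1985-06-16&amp;#039;))  -- one day short of 15: not flagged&lt;br /&gt;
       as t (b_birthdate);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So a female is flagged only once she has reached her 15th birthday, which matches the &amp;quot;at least 15 years of age&amp;quot; reading in the problem description.&lt;br /&gt;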
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
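&lt;br /&gt;
A minimal sketch of that change, assuming it is applied as a direct update to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and limited to the female arrivals selected by the query above (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a cycle codes to MISS for female arrivals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
        and exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
                   and biography.b_sex = &amp;#039;F&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;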
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
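&lt;br /&gt;
The two case corrections above could be sketched as a single update, assuming they are applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  Adding the new codes is a change to the list of valid values and is not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  -- Restrict to the three misspellings so that no other value is touched.&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;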
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
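&lt;br /&gt;
No query is listed for the start-time-only check mentioned in the problem description.  A sketch of one way to write it, following the pattern of the query above, is below; it has not been verified here that it reproduces the 430-row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;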
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
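&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type (in a blank-padded &amp;lt;code&amp;gt;char(n)&amp;lt;/code&amp;gt; column trailing spaces are not significant, and this would have no effect):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal ids.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  -- Touch only the rows that actually change.&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;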
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
An earlier version of this query spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) and failed with:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a value of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
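&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a direct update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 animal id numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;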
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
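&lt;br /&gt;
The mapping could be previewed with a query like the following sketch.  The output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, and treating any all-blank value (not only a single space) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt; is an assumption.  Any value not covered by the rule comes out as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so it remains visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL and blank values become FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         -- X and YES become TRUE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;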
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
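The conversions proposed above, pending confirmation, could be previewed with a query along the following lines. This is only a sketch: it assumes that flags are compared after trimming and upper-casing (as in the queries above), that any all-white-space value is treated like &amp;#039; &amp;#039;, and that NULL and any other unlisted value are left unmapped (NULL) for review.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: previews the proposed sokwe conversions; it changes no data.&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- Assumption: empty and all-white-space values are treated like &amp;#039; &amp;#039;.&lt;br /&gt;
    WHEN BTRIM(ae_recipient_certainty_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- Assumption: NULL and unlisted values stay NULL so they can be reviewed.&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS proposed_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;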
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
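The two proposed conversion steps above could be previewed for a single flag column roughly as follows. This is a sketch under stated assumptions, not the conversion code: it uses ae_decided_flag as the example column, compares values after trimming and upper-casing, passes unlisted values through the first step unchanged, and leaves NULL (and anything else the lists above do not cover) as NULL in the second step.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: previews both conversion steps for one flag; it changes no data.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_decided_flag AS raw_value,&lt;br /&gt;
    -- Step 1: clean schema conversions.&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_decided_flag  -- assumption: other values (and NULL) pass through&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(raw_value, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value,&lt;br /&gt;
  COALESCE(clean_value, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS clean_value,&lt;br /&gt;
  -- Step 2: sokwe conversions.&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    ELSE NULL  -- assumption: NULL and unlisted values are left for review&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;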
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which pivots each row into one record for ae_b_aggressor_id and one for ae_b_recipient_id; the count is therefore the number of combined, pivoted records, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
&lt;br /&gt;
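To address the first note above, the number of distinct AGGRESSION_EVENT rows affected (rather than pivoted records) could be obtained with a count along these lines. This is a sketch that assumes the same criteria as the queries below: NULL and empty-string participants are ignored, and ids are compared to biography without trimming white space.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each AGGRESSION_EVENT row is counted once, even when both&lt;br /&gt;
-- the aggressor and the recipient are missing from biography.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;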
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside that animal&amp;#039;s valid study participation window in the biography data. Note that the query strips white space from the aggression participant ids before comparing them with the biography animid values. The focus here is on checking for events outside the defined date ranges rather than on matching ids, and the query assumes that the white space issues will be resolved (the migration work-around also strips white space). The query further restricts the aggression events to those for which there is a valid follow record (the migration work-around does not).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
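A variation on the query above that also reports the length of each value and how many rows carry it may make the white space variants easier to tell apart. This is a sketch; it assumes ae_fight_category is a text or varchar column, so that LENGTH counts trailing spaces.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: quoted value, its length, and row count per distinct value.&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(&amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039;, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS quoted_value,&lt;br /&gt;
  LENGTH(ae_fight_category) AS value_length,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC, quoted_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;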
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error when the aggression data are loaded. Note that, in practice, other problems could generate the same error.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
The trailing spaces have been cleaned up in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema to see the bad data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
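The query above groups on fl_sci_food_name, so it lists scientific names that appear on more than one row. To check the direction described in the problem statement (one local name associated with several scientific names), a query along these lines could be used. This is a sketch that assumes both columns are text and that names are compared exactly as stored, without trimming or case folding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: local food names that map to more than one scientific name.&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
  STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;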
== * (#86) There are numerous instances of local food parts that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances of local food parts that conflate name, initial, and/or the English translation.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
* need to clarify `WADUDU WENGINE`, which shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
* the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
* different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
* `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
* how should we treat `NA`, `NONE`, and `UNRECORDED`?&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=611</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=611"/>
		<updated>2026-06-02T23:18:57Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: fix hard coded line ending&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
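&lt;br /&gt;
A minimal sketch of such a view, not the actual implementation. All names here (&amp;lt;code&amp;gt;biography_data&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt;) are assumptions, as is the rule that a confirmed dad takes precedence over a preliminary one:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical names; illustrates the idea only.&lt;br /&gt;
-- Use the confirmed dad when present, otherwise the preliminary&lt;br /&gt;
-- dad with the suffix appended.  NULL when neither is recorded.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , coalesce(dad_id, dadidprelim || &amp;#039;_prelim&amp;#039;) as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;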
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
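&lt;br /&gt;
A minimal sketch of one way the conversion could do this, assuming the fix is simply to discard the date portion of the value (the actual conversion code is not shown on this page, and the &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is illustrative):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the time of day in BREC_time is meaningful.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;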
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
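&lt;br /&gt;
A minimal sketch of the substitution, in the style of the queries above. It assumes an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in the people table; how the conversion actually applies the substitution and marks the person &amp;quot;inactive&amp;quot; is not shown here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
-- Replace NULL made_by values with that code.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;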
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from follow_arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from follow_arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
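&lt;br /&gt;
A minimal sketch, not the conversion code itself, of how the cleanups for problems #31 and #32 might be previewed together.  It uses only the &amp;lt;code&amp;gt;easy.follow&amp;lt;/code&amp;gt; column that appears in the queries above; the &amp;lt;code&amp;gt;old_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;new_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;follows&amp;lt;/code&amp;gt; output names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every animid that the two cleanups would change,&lt;br /&gt;
-- with the value it would become and the number of follows affected.&lt;br /&gt;
select fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
     , count(*) as follows&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where fol_b_animid &amp;lt;&amp;gt; upper(rtrim(fol_b_animid))&lt;br /&gt;
  group by fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;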
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into FOLLOW_OBSERVERS.OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
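&lt;br /&gt;
The row generation described above could be sketched roughly as follows.  This is an illustration under stated assumptions, not the actual conversion code: the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; are placeholders, and the output column names simply echo the FOLLOW_OBSERVERS columns named above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumed placeholder period codes.&lt;br /&gt;
-- One row per follow and time period, skipping a period when both of&lt;br /&gt;
-- its observer columns are NULL or empty after trimming.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
     , BTRIM(fol_am_observer_1) AS obs_brec&lt;br /&gt;
     , BTRIM(fol_am_observer_2) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; AS period&lt;br /&gt;
     , BTRIM(fol_pm_observer_1) AS obs_brec&lt;br /&gt;
     , BTRIM(fol_pm_observer_2) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;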
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
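&lt;br /&gt;
As a rough sketch of this solution (the output column names are illustrative, and the real conversion may build these rows differently), each follow with no recorded observers would yield a single placeholder row:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one &amp;#039;UNK&amp;#039; period row, with &amp;#039;NONE&amp;#039; for both observers,&lt;br /&gt;
-- for each follow that has no observer recorded in any column.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;UNK&amp;#039; as period&lt;br /&gt;
     , &amp;#039;NONE&amp;#039; as obs_brec&lt;br /&gt;
     , &amp;#039;NONE&amp;#039; as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;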
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
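&lt;br /&gt;
One way to express the substitution, shown here for the morning observers only, is &amp;lt;code&amp;gt;NULLIF&amp;lt;/code&amp;gt; inside &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  This is a sketch with illustrative output names, not necessarily how the conversion does it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: NULLIF turns an empty (trimmed) observer into NULL,&lt;br /&gt;
-- and COALESCE then replaces the NULL with the &amp;#039;NONE&amp;#039; person.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;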
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
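&lt;br /&gt;
To see which observer values the &amp;quot;/&amp;quot; rule would affect, a simpler listing such as the following may help.  It is a sketch that tests only for the presence of &amp;quot;/&amp;quot;, so it also returns &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each observer value containing &amp;#039;/&amp;#039;, with the number of&lt;br /&gt;
-- times it is used across the four observer columns.&lt;br /&gt;
SELECT observers.person, COUNT(*) AS uses&lt;br /&gt;
  FROM (SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_2) FROM clean.follow) AS observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY observers.person&lt;br /&gt;
  ORDER BY observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;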
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  That is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants, and excluding the resulting duplicates, complicates the conversion process.  So resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
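&lt;br /&gt;
In query form, the substitution might look like the sketch below.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; output name is illustrative, and the conversion itself may apply the default elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: arrivals with no cycle information get the MISS value.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;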
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
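&lt;br /&gt;
Should rows of this kind reappear, the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; rule could be previewed with a query along these lines.  This is a sketch only: it assumes the cycle code is stored as text, as the comparisons above suggest, and the &amp;lt;code&amp;gt;old_cycle&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_cycle&amp;lt;/code&amp;gt; names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals (including unknown sex) whose cycle&lt;br /&gt;
-- code of &amp;#039;0&amp;#039; would become &amp;#039;n/a&amp;#039;.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle as old_cycle&lt;br /&gt;
     , &amp;#039;n/a&amp;#039; as new_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;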
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
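&lt;br /&gt;
The &amp;quot;at least 15 years of age&amp;quot; remark can be checked with a small worked example.  The follow date below is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example with a hypothetical follow date of 2000-06-15.&lt;br /&gt;
-- The result is the latest birthdate that the test still flags.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The cutoff works out to 1985-06-15, so only a female who has reached her 15th birthday by the date of the follow is flagged.&lt;br /&gt;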
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
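&lt;br /&gt;
A minimal sketch of the change, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table.  The actual cleanup code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: set the cycle code to MISS for female arriving&lt;br /&gt;
-- individuals whose cycle code is &amp;#039;n/a&amp;#039;.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;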
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
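&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it. A sketch of that check, assuming it mirrors the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped from the grouping:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As with the query above, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time never match on equality, so they are not listed.&lt;br /&gt;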
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
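&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type (so trailing spaces are significant in comparisons) and that the change is made with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;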
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
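&lt;br /&gt;
A minimal sketch of the change, assuming it is made with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion code may implement it differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;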
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
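&lt;br /&gt;
A sketch of how this rule might be previewed before it is applied. It assumes the flag is stored as text and that a value made up of one or more spaces should be treated like a single space; neither assumption is confirmed by the data dump.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the proposed boolean for each distinct raw value&lt;br /&gt;
select ae.ae_bad_observation_flag as raw_flag&lt;br /&gt;
     , case&lt;br /&gt;
         when ae.ae_bad_observation_flag is null then false&lt;br /&gt;
         when btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; then false&lt;br /&gt;
         when ae.ae_bad_observation_flag in (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) then true&lt;br /&gt;
         else null  -- value not covered by the rule; needs a decision&lt;br /&gt;
       end as converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.aggression_event as ae&lt;br /&gt;
  group by ae.ae_bad_observation_flag&lt;br /&gt;
  order by ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Any raw value that is not null but shows a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; converted_flag is one the rule does not cover.&lt;br /&gt;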
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
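&lt;br /&gt;
A minimal sketch of how the proposed mapping could be previewed before it is applied, pending the confirmation requested above. Trimming, upper-casing, and treating an empty string like a single space are assumptions, as is leaving NULL and any unlisted value unmapped so that they stand out in the result. This only reads the data and has not been run against it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only: shows each raw flag next to the value the proposed mapping would give it.&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL  -- assumption: NULL and unlisted values are left unmapped&lt;br /&gt;
  END AS proposed_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;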
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
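&lt;br /&gt;
A sketch of how the two conversion steps listed above could be previewed together for all nine flags, pending confirmation. It assumes the flag columns are text, treats empty or whitespace-only values like `?`, and leaves NULL and any unlisted value as they are; none of these choices is confirmed. The query only reads the data and has not been run against it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH flag_values AS (&lt;br /&gt;
  -- Pivot the nine flag columns into (flag_name, raw_value) pairs.&lt;br /&gt;
  SELECT v.flag_name, v.raw_value&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  CROSS JOIN LATERAL (&lt;br /&gt;
    VALUES&lt;br /&gt;
      (&amp;#039;ae_decided_flag&amp;#039;::text, ae.ae_decided_flag),&lt;br /&gt;
      (&amp;#039;ae_multiple_aggressor_flag&amp;#039;, ae.ae_multiple_aggressor_flag),&lt;br /&gt;
      (&amp;#039;ae_multiple_recipient_flag&amp;#039;, ae.ae_multiple_recipient_flag),&lt;br /&gt;
      (&amp;#039;ae_bad_observation_flag&amp;#039;, ae.ae_bad_observation_flag),&lt;br /&gt;
      (&amp;#039;ae_bristle_flag&amp;#039;, ae.ae_bristle_flag),&lt;br /&gt;
      (&amp;#039;ae_display_flag&amp;#039;, ae.ae_display_flag),&lt;br /&gt;
      (&amp;#039;ae_chase_flag&amp;#039;, ae.ae_chase_flag),&lt;br /&gt;
      (&amp;#039;ae_contact_flag&amp;#039;, ae.ae_contact_flag),&lt;br /&gt;
      (&amp;#039;ae_vocal_flag&amp;#039;, ae.ae_vocal_flag)&lt;br /&gt;
  ) AS v(flag_name, raw_value)&lt;br /&gt;
),&lt;br /&gt;
cleaned AS (&lt;br /&gt;
  -- Step 1: proposed clean schema conversion.&lt;br /&gt;
  SELECT&lt;br /&gt;
    flag_name,&lt;br /&gt;
    raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN raw_value IS NULL THEN NULL&lt;br /&gt;
      WHEN BTRIM(raw_value) IN (&amp;#039;&amp;#039;, &amp;#039;?&amp;#039;) THEN &amp;#039; &amp;#039;  -- assumption: blanks treated like ?&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_value)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_value)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE raw_value  -- unlisted values pass through unchanged&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM flag_values&lt;br /&gt;
)&lt;br /&gt;
-- Step 2: proposed sokwe conversion; anything else (including NULL) stays NULL.&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE clean_value WHEN &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039; WHEN &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039; END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY flag_name, raw_value, clean_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;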
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes that follow) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
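&lt;br /&gt;
A sketch of one way to count the AGGRESSION_EVENT rows themselves rather than the pivoted Actor/Actee records: each event is counted once if either id is missing from biography. It keeps the same NULL and empty-string exclusions and the same exact (untrimmed) comparison as the queries in this section. It has not been run against the data, so no count is given here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Counts each aggression event at most once, however many of its ids fail to match.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM (&lt;br /&gt;
    VALUES&lt;br /&gt;
      (ae.ae_b_aggressor_id),&lt;br /&gt;
      (ae.ae_b_recipient_id)&lt;br /&gt;
  ) AS v(participant)&lt;br /&gt;
  WHERE v.participant IS NOT NULL&lt;br /&gt;
    AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = v.participant&lt;br /&gt;
    )&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;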
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside that animal&amp;#039;s valid study participation window in the biography data. Note that the query trims white space from the aggression participant ids before comparing them with the biography animids: the focus here is on events outside the defined date ranges rather than on matching ids, and it is assumed that the white space issues will be resolved (the migration work-around also trims). The query further restricts the check to aggression events for which there is a valid follow record (the migration work-around does not).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
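&lt;br /&gt;
A sketch of a variant that adds the length and row count of each distinct value, which can make it easier to tell the white space variants apart. It assumes ae_fight_category is text or varchar (for a fixed-width character column LENGTH would ignore trailing spaces) and has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per distinct raw value, quoted so that leading and trailing spaces are visible.&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(&amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039;, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS quoted_value,&lt;br /&gt;
  LENGTH(ae_fight_category) AS n_chars,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC, quoted_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;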
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
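&lt;br /&gt;
The query above groups on fl_sci_food_name, so it lists scientific names that appear on more than one row of the lookup. A sketch of the complementary check, grouped on fl_local_food_name as the problem statement reads, is given here in case that is the direction intended. It assumes both columns are text, compares names exactly (no trimming or case folding), and has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Local food names that map to more than one distinct scientific name.&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
  STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;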
&lt;br /&gt;
== * (#86) There are numerous instances of local food parts that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances of local food parts that conflate name,&lt;br /&gt;
initial, and/or the English translation.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
- there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
- need to clarify `WADUDU WENGINE`, which shares both an initial (`D`) and an English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
- the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
- different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
- `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
- how should we treat `NA`, `NONE`, and `UNRECORDED`?&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=610</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=610"/>
		<updated>2026-06-02T23:16:26Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Turns out that Mediawiki has its own table format and does not play well with markdown tables.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
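&lt;br /&gt;
A minimal sketch of the substitution, for illustration only and not necessarily how the conversion implements it.  It assumes the unknown person is stored under the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;, and it reuses the table and column from the query above; &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; is a made-up column name:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each log row with a NULL made_by replaced by UNK.&lt;br /&gt;
-- Assumption: the unknown person has the code UNK.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;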
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
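&lt;br /&gt;
A minimal sketch of how the start-in-nest value could be derived from follow_arrival alone, in line with the accepted solution.  It assumes, as the query above does, that a fa_type_of_nesting of 1 or 3 means the focal started in the nest.  The begin_in_nest name is illustrative only, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: derive start-in-nest from the focal&amp;#039;s own first arrival.&lt;br /&gt;
-- begin_in_nest is a hypothetical output name; it is NULL when&lt;br /&gt;
-- fa_type_of_nesting is NULL.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3)) AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Restrict to the focal&amp;#039;s own arrival row.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;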
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
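&lt;br /&gt;
A minimal sketch that previews this cleanup together with the trailing-space removal of problem #31.  It assumes both cleanups can be applied as a single expression; the column aliases are illustrative only, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the original and the normalized animid for every&lt;br /&gt;
-- follow that either cleanup would change.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;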
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
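&lt;br /&gt;
A minimal sketch of the substitution, shown for the AM columns only (the PM columns would be handled the same way).  The obs_brec and obs_tiki aliases are illustrative, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the &amp;quot;NONE&amp;quot; person wherever one of the two AM&lt;br /&gt;
-- observers is NULL or empty after trimming.  Follows with neither&lt;br /&gt;
-- AM observer are skipped, as described in problem #36.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;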
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
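&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied as an expression while reading follow_arrival.  The cycle_state alias is illustrative only, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report MISS wherever no cycle information was recorded.&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;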
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
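&lt;br /&gt;
A minimal sketch of the recoding, written against the same tables as the query above.  The cycle_state alias is illustrative only, and the actual cleanup code in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report MISS in place of n/a for female arrivals.&lt;br /&gt;
-- The left join keeps arrivals that have no biography row; their&lt;br /&gt;
-- cycle code passes through unchanged.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , case when biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
                 and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
            then &amp;#039;MISS&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    left join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;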
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
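&lt;br /&gt;
The case normalization above might be sketched as follows, assuming it is done with plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the variants listed are the only ones that need changing.  Adding the new codes is not shown, because where the valid codes are stored is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants named in the solution.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;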
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
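&lt;br /&gt;
The start-time-only check mentioned in the problem description has no query above.  One way to write it, as a sketch, is below.  It lists each duplicated combination with a per-group count; the text does not say whether the 430 figure counts groups or rows, so the output may need to be summed to compare.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as row_count&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;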
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
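&lt;br /&gt;
A minimal sketch of such a cleanup, assuming a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and no constraints that would block changing the focal id in place:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows the &amp;quot;Bad data&amp;quot; query finds.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;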
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
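&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;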
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
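&lt;br /&gt;
The mapping above can be previewed before it is applied.  The query below is a sketch, not the conversion code itself.  It assumes the flag is a text column and leaves any value not named in the solution as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out for review.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         ELSE NULL  -- value not covered by the solution; review by hand&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;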
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
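&lt;br /&gt;
Pending the confirmation requested above, the proposed conversions can be previewed with a query along these lines.  This is a sketch under assumptions: &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; is read as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern, blank strings of any length are treated like a single space, and anything not in the list (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) is left unconverted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value next to its proposed code.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN btrim(ae.ae_recipient_certainty_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         ELSE NULL  -- not covered by the list above; review by hand&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;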
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows in which one or more flags hold a value other than the required `X` or NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
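&lt;br /&gt;
The two conversion stages can be chained to preview their combined effect. This is a hedged sketch for a single flag, ae_decided_flag; the same pattern would apply to the other flag columns. The CTE name &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt; and the aliases are illustrative, matching is assumed to be case-insensitive, and values that neither list covers (including NULL) are assumed to pass through the first stage unchanged and to come out of the second stage as NULL for review.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only: nothing is updated.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_decided_flag AS raw_value,&lt;br /&gt;
    -- stage 1: clean schema conversions&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_decided_flag  -- assumption: other values are left unchanged&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  -- stage 2: sokwe conversions&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL  -- assumption: unresolved values are left for review&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY 1, 2, 3&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;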
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields one pivoted record per role (ae_b_aggressor_id and ae_b_recipient_id), so the count is the number of offending Actor/Actee values, not the number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
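&lt;br /&gt;
==== Count of distinct AGGRESSION_EVENT rows ====&lt;br /&gt;
&lt;br /&gt;
Because the queries above pivot each event into one record per role, an event whose Actor and Actee are both missing from biography is counted twice. A sketch of a count that is not inflated by the pivot, applying the same per-participant conditions to each role but counting every AGGRESSION_EVENT row at most once (the alias &amp;lt;code&amp;gt;event_row_count&amp;lt;/code&amp;gt; is illustrative):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        -- Actor present but not in biography&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
        -- Actee present but not in biography&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;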
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in AGGRESSION_EVENT on a date outside that individual&amp;#039;s valid study participation window in the biography data.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The query strips white space from the aggression participant ids before comparing them to the biography animids. The focus here is on events outside the defined date ranges rather than on matching ids, and it assumes that the white space issues will be resolved separately (the migration work-around also strips white space).&lt;br /&gt;
* The query is further restricted to aggression events for which there is a valid follow record (the migration work-around does not apply this restriction).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
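&lt;br /&gt;
To tell the white space variants apart (for example, spaces versus tabs), the raw bytes can be shown next to the quoted value. This is a sketch that assumes ae_fight_category is a text or varchar column; for a fixed-width character column the padding would be lost when the value is converted. The aliases are illustrative. In the hex output a space appears as 20 and a tab as 09.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; AS quoted_value,&lt;br /&gt;
    -- hex dump of the stored value, two hex digits per byte&lt;br /&gt;
    ENCODE(CONVERT_TO(ae_fight_category, &amp;#039;UTF8&amp;#039;), &amp;#039;hex&amp;#039;) AS hex_bytes,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC, hex_bytes;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;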
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error, although in practice the same error can also be generated for other reasons.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
The data have been cleaned up in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema to see the problem rows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
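&lt;br /&gt;
The query above groups by fl_sci_food_name, so it lists scientific names that occur on more than one row. If the check is meant to be read in the direction stated in the problem (one local name associated with several scientific names), a sketch grouping by the local name instead might look like the following; the alias &amp;lt;code&amp;gt;sci_name_count&amp;lt;/code&amp;gt; is illustrative, and this is an alternative reading rather than a replacement for the query above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    -- number of different scientific names recorded for this local name&lt;br /&gt;
    COUNT(DISTINCT fl_sci_food_name) AS sci_name_count&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY sci_name_count DESC, fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;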
&lt;br /&gt;
== * (#86) There are numerous instances of local food parts that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances of local food parts that conflate name, initial, and/or the English translation.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! fpl_local_food_part !! fpl_food_part_initials !! fpl_english_food_part&lt;br /&gt;
|-&lt;br /&gt;
| CHIPUKIZA      || C  || SHOOTS&lt;br /&gt;
|-&lt;br /&gt;
| MAJANI         || J  || LEAVES&lt;br /&gt;
|-&lt;br /&gt;
| MBEGU          || MB || SEEDS&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || W  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| MABUA          || B  || PITH&lt;br /&gt;
|-&lt;br /&gt;
| MAGOMA         || G  || BARK&lt;br /&gt;
|-&lt;br /&gt;
| MATUNDA        || T  || FRUIT&lt;br /&gt;
|-&lt;br /&gt;
| MAUA           || M  || FLOWERS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVI         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU WENGINE || D  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UTOMVU         || U  || SAP&lt;br /&gt;
|-&lt;br /&gt;
| MCHWA          || W  || TERMITES&lt;br /&gt;
|-&lt;br /&gt;
| MIFUPA         || NA || BONES&lt;br /&gt;
|-&lt;br /&gt;
| MITI           || NA || TREE&lt;br /&gt;
|-&lt;br /&gt;
| MIZIZI         || NA || ROOTS&lt;br /&gt;
|-&lt;br /&gt;
| NA             || NA || NOT APPLICABLE&lt;br /&gt;
|-&lt;br /&gt;
| NONE           || NA || None&lt;br /&gt;
|-&lt;br /&gt;
| NYAMA          || N  || MEAT&lt;br /&gt;
|-&lt;br /&gt;
| SIAFU          || S  || INSECTS&lt;br /&gt;
|-&lt;br /&gt;
| UNRECORDED     || NA || UNRECORDED&lt;br /&gt;
|-&lt;br /&gt;
| WADUDU         || D  || INSECTS&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
* There are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`.&lt;br /&gt;
* `WADUDU WENGINE` needs clarification: it shares an initial (`D`) and English translation (`INSECTS`) with `WADUDU`.&lt;br /&gt;
* The initial `W` is associated with both `WADUDU` and `MCHWA`.&lt;br /&gt;
* There are different spellings for `SAP`: `UTOMVU` and `UTOMVI`.&lt;br /&gt;
* `INSECTS` is associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`.&lt;br /&gt;
* How should we treat `NA`, `NONE`, and `UNRECORDED`?&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=609</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=609"/>
		<updated>2026-06-02T23:10:01Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #86 There are numerous instances of local food parts that that conflate name, initial, and/or the english translation.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
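&lt;br /&gt;
A minimal sketch of such a view, under stated assumptions: the column names &amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt; are hypothetical, and DadIDPrelim is assumed to be a boolean flag.  The actual view definition may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical column names; dadidprelim is assumed to be a boolean.&lt;br /&gt;
-- When the flag is set, the suffix is appended to the stored dad id.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case when dadidprelim&lt;br /&gt;
              then dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
              else dad_id&lt;br /&gt;
         end as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;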
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
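&lt;br /&gt;
The page does not say how the conversion fixes the values.  One possible approach, shown here only as a sketch and assuming that only the time-of-day portion is meaningful, is to cast the column to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;, which discards the date portion.  The alias &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: keep only the time of day, dropping any stray date.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;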
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
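&lt;br /&gt;
The substitution described in this solution can be sketched as follows.  This is an illustration, not the code from the commit: it assumes &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is textual and that the unknown person has the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
-- coalesce() returns made_by unless it is NULL, in which case it returns &amp;#039;UNK&amp;#039;.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;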
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
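&lt;br /&gt;
The direction of the mismatches can be checked by counting them.  The following is only a sketch: it reuses the tables, columns, and conditions of the Bad Data query above, and the &amp;lt;code&amp;gt;direction&amp;lt;/code&amp;gt; labels are made up for illustration.  A first count much larger than the second would be consistent with the remark above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count the begin-in-nest mismatches by direction.&lt;br /&gt;
-- The direction labels are illustrative, not part of the database.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT CASE&lt;br /&gt;
         WHEN follow.fol_flag_begin_in_nest = 0&lt;br /&gt;
           THEN &amp;#039;arrival says nest, follow does not&amp;#039;&lt;br /&gt;
         ELSE &amp;#039;follow says nest, arrival does not&amp;#039;&lt;br /&gt;
       END AS direction&lt;br /&gt;
     , COUNT(*)&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  GROUP BY 1&lt;br /&gt;
  ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;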
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
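&lt;br /&gt;
A minimal sketch of the combined cleanup for this problem and problem #31, assuming it is enough to apply &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;upper()&amp;lt;/code&amp;gt; to the animid.  The actual conversion code may do this differently, and the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: list each animid that cleanup would change, next to its cleaned form.&lt;br /&gt;
select distinct fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;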
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
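&lt;br /&gt;
The rule described above can be sketched as a query that yields one candidate row per follow and time period.  This is an illustration only, not the conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: a period gets a row only when at least one of its&lt;br /&gt;
-- two observer columns is non-empty after trimming.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; codes are assumed, not taken from PERIODS.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;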
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
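&lt;br /&gt;
A hedged sketch of the substitution, shown for the morning columns only.  It assumes the person is stored under the code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;; the output column names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: morning rows with exactly one observer recorded.&lt;br /&gt;
-- The missing observer is replaced with the assumed &amp;#039;NONE&amp;#039; code.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;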
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
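&lt;br /&gt;
To see why the test means at least 15 years of age, the comparison from the query above can be run against literal dates.  The dates below are made up for illustration: an individual born on 2000-06-15 is not flagged on the day before her 15th birthday, but is flagged on the birthday itself.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the age test from the query above, applied to made-up dates.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2015-06-14&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL)&lt;br /&gt;
         as day_before_15th_birthday   -- false, so not flagged&lt;br /&gt;
     , date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2015-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL)&lt;br /&gt;
         as on_15th_birthday;          -- true, so flagged&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;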
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
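&lt;br /&gt;
A minimal sketch of these two corrections, assuming they are applied directly to the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Assumption: the fix is made in clean.follow_arrival.&lt;br /&gt;
-- Normalize the case variants of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the case variant of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;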
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- No duplicates when the sequence number is included&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
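&lt;br /&gt;
The 430-row figure in the problem description comes from leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the check. A sketch of that variant follows, assuming it differs from the query above only in dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;; the row count has not been re-verified here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left out)&lt;br /&gt;
-- Sketch only: assumed to be the query behind the 430-row count.&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;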
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
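&lt;br /&gt;
A minimal sketch of the removal, assuming it is done with a single update against the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the statement actually used in the conversion is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  rtrim() with one argument removes trailing spaces.&lt;br /&gt;
-- The where clause limits the update to the rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;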
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
When the query above spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (ending in a lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;), it failed with:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution: it does not validate anything concerning the rows affected. It is nonetheless adequate.&lt;br /&gt;
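&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the biography table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Replaces every 0 with NULL and checks nothing else about the row.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;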
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
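&lt;br /&gt;
The query below is a sketch that previews this mapping without changing any data. The &amp;lt;code&amp;gt;as_boolean&amp;lt;/code&amp;gt; column name is hypothetical, and matching spaces by trimming is an assumption about what &amp;quot;a space&amp;quot; means in the data; the conversion code itself may implement the rule differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: shows how each distinct flag value would be mapped.&lt;br /&gt;
-- Assumption: values made up solely of spaces are treated like a single space.&lt;br /&gt;
-- Values not covered by the rule produce NULL in as_boolean, so they stand out.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN BTRIM(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS as_boolean&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;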
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id values that do not appear among the follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are among the rows excluded as part of Problem #70. As such, there is no separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
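&lt;br /&gt;
The proposed conversions can be previewed with a query like the sketch below. It assumes that &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; is meant as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern (any value beginning with &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt;) and that the other values are matched exactly; &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is a hypothetical column name. The mapping itself is still awaiting confirmation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: lists each raw flag value next to its proposed conversion.&lt;br /&gt;
-- Assumption: &amp;#039;Y%&amp;#039; is a LIKE pattern; the other values are exact matches.&lt;br /&gt;
-- Values with no proposed conversion (including NULL) show NULL in converted_flag.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , COUNT(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;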
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required `X` or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
&lt;br /&gt;
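Because the pivot counts an event twice when both participants fail to match, a count of the underlying AGGRESSION_EVENT rows may be more useful. The following is a sketch only, applying the same exact-match and non-empty rules as the queries below; the event_count column name is illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count each event once, even if both Actor and Actee are unmatched.&lt;br /&gt;
SELECT COUNT(*) AS event_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;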
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside the valid study participation window of that individual in the biography data. The query strips white space when comparing aggression participant ids with biography animid values: the focus here is on events outside the defined date ranges rather than on matching ids, and issues concerning white space are assumed to be resolved separately (the stripping is also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (a restriction that is not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error but note that, in practice, this error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
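The query above groups on fl_sci_food_name, so it lists scientific names that appear on more than one row. If the intent is instead to list each local food name that maps to more than one scientific name, as the problem statement describes, a sketch along these lines may help. The sci_name_count and sci_names column names are illustrative, and the result has not been checked against the 47 records reported above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per local name that has two or more distinct scientific names.&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
  STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;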
== * (#86) There are numerous instances of local food parts that conflate name, initial, and/or the English translation. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are numerous instances of local food parts that conflate name,&lt;br /&gt;
initial, and/or the English translation.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
==== query ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH source_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ROW_NUMBER() OVER () AS src_ord,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_local_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS local_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_food_part_initials, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS initials_arr,&lt;br /&gt;
&lt;br /&gt;
        COALESCE((&lt;br /&gt;
            SELECT ARRAY_AGG(tok)&lt;br /&gt;
            FROM (&lt;br /&gt;
                SELECT BTRIM(x) AS tok&lt;br /&gt;
                FROM REGEXP_SPLIT_TO_TABLE(COALESCE(fpl_english_food_part, &amp;#039;&amp;#039;), E&amp;#039;[;:,]&amp;#039;) AS x&lt;br /&gt;
                WHERE BTRIM(x) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
            ) s&lt;br /&gt;
        ), ARRAY[]::text[]) AS english_arr&lt;br /&gt;
&lt;br /&gt;
    FROM clean.food_part_lookup&lt;br /&gt;
),&lt;br /&gt;
expanded AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        s.src_ord,&lt;br /&gt;
        gs.idx,&lt;br /&gt;
        COALESCE(s.local_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_local_food_part,&lt;br /&gt;
        COALESCE(s.initials_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_food_part_initials,&lt;br /&gt;
        COALESCE(s.english_arr[gs.idx], &amp;#039;&amp;#039;) AS fpl_english_food_part&lt;br /&gt;
    FROM source_rows s&lt;br /&gt;
    CROSS JOIN LATERAL GENERATE_SERIES(&lt;br /&gt;
        1,&lt;br /&gt;
        GREATEST(&lt;br /&gt;
            CARDINALITY(s.local_arr),&lt;br /&gt;
            CARDINALITY(s.initials_arr),&lt;br /&gt;
            CARDINALITY(s.english_arr)&lt;br /&gt;
        )&lt;br /&gt;
    ) AS gs(idx)&lt;br /&gt;
),&lt;br /&gt;
deduped AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        e.*,&lt;br /&gt;
        ROW_NUMBER() OVER (&lt;br /&gt;
            PARTITION BY&lt;br /&gt;
                e.fpl_local_food_part,&lt;br /&gt;
                e.fpl_food_part_initials,&lt;br /&gt;
                e.fpl_english_food_part&lt;br /&gt;
            ORDER BY e.src_ord, e.idx&lt;br /&gt;
        ) AS rn&lt;br /&gt;
    FROM expanded e&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part&lt;br /&gt;
FROM deduped&lt;br /&gt;
WHERE rn = 1&lt;br /&gt;
ORDER BY src_ord, idx,&lt;br /&gt;
    fpl_local_food_part,&lt;br /&gt;
    fpl_food_part_initials,&lt;br /&gt;
    fpl_english_food_part;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== summary ====&lt;br /&gt;
&lt;br /&gt;
|fpl_local_food_part |fpl_food_part_initials |fpl_english_food_part |&lt;br /&gt;
|:-------------------|:----------------------|:---------------------|&lt;br /&gt;
|CHIPUKIZA           |C                      |SHOOTS                |&lt;br /&gt;
|MAJANI              |J                      |LEAVES                |&lt;br /&gt;
|MBEGU               |MB                     |SEEDS                 |&lt;br /&gt;
|WADUDU              |W                      |INSECTS               |&lt;br /&gt;
|MABUA               |B                      |PITH                  |&lt;br /&gt;
|MAGOMA              |G                      |BARK                  |&lt;br /&gt;
|MATUNDA             |T                      |FRUIT                 |&lt;br /&gt;
|MAUA                |M                      |FLOWERS               |&lt;br /&gt;
|UTOMVI              |U                      |SAP                   |&lt;br /&gt;
|WADUDU WENGINE      |D                      |INSECTS               |&lt;br /&gt;
|UTOMVU              |U                      |SAP                   |&lt;br /&gt;
|MCHWA               |W                      |TERMITES              |&lt;br /&gt;
|MIFUPA              |NA                     |BONES                 |&lt;br /&gt;
|MITI                |NA                     |TREE                  |&lt;br /&gt;
|MIZIZI              |NA                     |ROOTS                 |&lt;br /&gt;
|NA                  |NA                     |NOT APPLICABLE        |&lt;br /&gt;
|NONE                |NA                     |None                  |&lt;br /&gt;
|NYAMA               |N                      |MEAT                  |&lt;br /&gt;
|SIAFU               |S                      |INSECTS               |&lt;br /&gt;
|UNRECORDED          |NA                     |UNRECORDED            |&lt;br /&gt;
|WADUDU              |D                      |INSECTS               |&lt;br /&gt;
&lt;br /&gt;
==== notes ====&lt;br /&gt;
&lt;br /&gt;
- there are two initials (`W`, `D`) for `WADUDU` ~ `INSECTS`&lt;br /&gt;
- need to clarify `WADUDU WENGINE`, which shares both an initial (`D`) and an&lt;br /&gt;
  English translation (`INSECTS`) with `WADUDU`&lt;br /&gt;
- the initial `W` is associated with both `WADUDU` and `MCHWA`&lt;br /&gt;
- different spellings for `SAP`: `UTOMVU` and `UTOMVI`&lt;br /&gt;
- `INSECTS` associated with `WADUDU`, `WADUDU WENGINE`, and `SIAFU`&lt;br /&gt;
- how should we treat `NA`, `NONE`, and `UNRECORDED`&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=608</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=608"/>
		<updated>2026-05-29T01:18:21Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #85 there are numerous instances of local food names that translate to multiple scientific food names.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
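&lt;br /&gt;
The substitution described above can be sketched as follows.  This is a minimal illustration only, not the code from the commit; it assumes &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column and that the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;.  The output column name &amp;lt;code&amp;gt;made_by_converted&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: list each log row with a NULL made_by replaced by the&lt;br /&gt;
-- unknown person.  made_by_converted is a hypothetical column name.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;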
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When the solution is applied to both problems #26 and #27,&lt;br /&gt;
updating follow_arrival so that it is the single source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
297 follow_arrival rows are updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
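&lt;br /&gt;
Together with problem #31, the cleanup amounts to applying &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid.  As a minimal sketch (the column aliases are illustrative, and the actual conversion code may differ), the following previews the effect on the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each affected animid next to its cleaned form.&lt;br /&gt;
-- The aliases original_animid, cleaned_animid and n_follows are illustrative.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
     , count(*) as n_follows&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  group by fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;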
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both of the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
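&lt;br /&gt;
No count or query is recorded for this problem.  As a sketch, assuming that &amp;quot;no observer&amp;quot; means NULL or blank after trimming (the same test used in problem #37), the affected follows could be listed with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where the AM or the PM period has exactly one&lt;br /&gt;
-- observer.  Comparing the two boolean tests with &amp;lt;&amp;gt; is true when&lt;br /&gt;
-- one column is blank and the other is not.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;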
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
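&lt;br /&gt;
The query above only reports a name containing &amp;quot;/&amp;quot; when a second spelling of it, differing by character case, also exists.  As a sketch of a simpler check that does not depend on case variants (an alternative, not the query used to produce the count above), every distinct observer value containing a &amp;quot;/&amp;quot; can be listed with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: all distinct observer values containing a &amp;quot;/&amp;quot;.&lt;br /&gt;
-- Note that this also returns n/a.  NULL values drop out because&lt;br /&gt;
-- STRPOS of NULL is NULL.&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT person&lt;br /&gt;
  FROM people&lt;br /&gt;
  WHERE STRPOS(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(person), person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;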
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
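&lt;br /&gt;
As a sketch of the substitution (the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is illustrative, and the conversion itself may apply the default differently), the affected arrivals and the value they would receive can be previewed with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute MISS where no cycle information was recorded.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;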
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
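&lt;br /&gt;
To illustrate why the test flags only individuals who are at least 15 years old, the same date arithmetic can be run on literal dates.  The dates below are made up for the example.  The first column is true (the individual is exactly 15 on the follow date) and the second is false (one day short of 15):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: hypothetical birthdates tested against a hypothetical&lt;br /&gt;
-- follow date of 2000-06-15, using the same cutoff as the query above.&lt;br /&gt;
select date &amp;#039;1985-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL) as flagged_at_exactly_15&lt;br /&gt;
     , date &amp;#039;1985-06-16&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL) as flagged_one_day_short;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;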
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
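The case normalization above can be sketched as two updates. This is a minimal sketch, assuming the fix is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: fold the case variants into the accepted codes.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  WHERE fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;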
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
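The Problem section also mentions a check that leaves out &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query for it is recorded here. A minimal sketch, assuming that check is simply the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped, might look like this. Rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; are not matched by the equality comparison.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (hypothetical variant of the query above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;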
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows where the arriving individual is a female less than 8 years of age on the follow date, yet the sexual cycle code indicates sexual swelling.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
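The removal can be sketched as a single update. This is a minimal sketch, assuming the fix is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: strip trailing spaces from the focal id,&lt;br /&gt;
-- touching only the rows that actually have them.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;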
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
Quoted column names are case-sensitive in PostgreSQL: spelling the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
&lt;br /&gt;
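The change described above can be sketched as one update. This is a minimal sketch, assuming the fix is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: replace the 0 placeholders with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;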
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
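The second half of that fix can be sketched as follows. This is a minimal sketch, not the code from the commit named above. It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row already exists in ARRIVAL_SOURCES, whose columns are not shown on this page and so are not sketched.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: replace NULL data sources with the &amp;#039;none&amp;#039; code.&lt;br /&gt;
-- Assumes &amp;#039;none&amp;#039; has already been added to ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;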
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
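The mapping above can be previewed before it is applied. This is a minimal sketch: the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; alias is made up for illustration, and treating any all-blank value as a space is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: show each raw flag value next to its boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Any other value falls through to NULL, marking it as unhandled.&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;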
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
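The proposed conversions can be previewed against the raw values. This is a minimal sketch, pending the confirmation requested above: the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; alias is made up for illustration, and the comparisons take the list literally (exact, case-sensitive matches).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: show each raw flag value next to its conversion.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- NULL and any unlisted value fall through to NULL.&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;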
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than `X` or NULL (the required values) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
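&lt;br /&gt;
The two conversion steps above could be previewed for a single flag with a query like the following. This is a sketch, not the conversion code: it uses ae_chase_flag as an example, and the trimming, the case-insensitive matching, and leaving NULL (and any other unlisted value) without a sokwe value are assumptions to be confirmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: apply the clean schema conversions, then the sokwe conversions,&lt;br /&gt;
-- to one flag column (ae_chase_flag is used as the example).&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_chase_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_chase_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_chase_flag  -- assumption: other values pass through unchanged&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(raw_value, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_display,&lt;br /&gt;
  COALESCE(clean_value, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS clean_display,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
  END AS sokwe_value,  -- NULL here means the value is not covered by the list&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY 1, 2, 3&lt;br /&gt;
ORDER BY row_count DESC, raw_display;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;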
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id IS NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
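&lt;br /&gt;
Because the pivoted count is inflated, the number of distinct AGGRESSION_EVENT rows affected could be obtained with a query like the one below. This is a sketch under the same exclusions as the other queries for this problem (NULL and empty participants are skipped, and whitespace is not trimmed); the output name event_row_count is only illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count each AGGRESSION_EVENT row once, even when both&lt;br /&gt;
-- the Actor and the Actee are missing from biography.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;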
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips white space when comparing aggression participant ids with biography animids: the focus here is on events outside the defined date ranges rather than on id matching, and the query assumes that the white space issues will be resolved (the migration work-around also strips white space). The query further restricts the check to aggression events for which there is a valid follow record (the migration work-around does not).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#85) There are numerous instances of local food names that translate to multiple scientific food names. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 47 records documenting instances where a local food name (fl_local_food_name) is associated with more than one scientific food name (fl_sci_food_name).&lt;br /&gt;
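&lt;br /&gt;
As a complementary check, the association can be examined from the local-name side, grouping on fl_local_food_name and counting the distinct scientific names. This is a sketch only: it assumes both columns are text, and the output names sci_name_count and sci_names are illustrative. Its row count need not equal the 47 reported above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: local food names associated with more than one scientific name.&lt;br /&gt;
SELECT&lt;br /&gt;
  fl_local_food_name,&lt;br /&gt;
  COUNT(DISTINCT fl_sci_food_name) AS sci_name_count,&lt;br /&gt;
  STRING_AGG(DISTINCT fl_sci_food_name, &amp;#039;; &amp;#039; ORDER BY fl_sci_food_name) AS sci_names&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
GROUP BY fl_local_food_name&lt;br /&gt;
HAVING COUNT(DISTINCT fl_sci_food_name) &amp;gt; 1&lt;br /&gt;
ORDER BY fl_local_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;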
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH duplicates AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_sci_food_name,&lt;br /&gt;
    COUNT(*) AS count&lt;br /&gt;
  FROM clean.food_lookup&lt;br /&gt;
  GROUP BY fl_sci_food_name&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  )&lt;br /&gt;
  SELECT&lt;br /&gt;
    fl_local_food_name,&lt;br /&gt;
    duplicates.fl_sci_food_name&lt;br /&gt;
FROM clean.food_lookup&lt;br /&gt;
JOIN duplicates ON clean.food_lookup.fl_sci_food_name = duplicates.fl_sci_food_name&lt;br /&gt;
ORDER BY fl_sci_food_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=607</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=607"/>
		<updated>2026-05-28T19:40:51Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #84 Some follows have a community with trailing spaces&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
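&lt;br /&gt;
A minimal sketch of one way such a fix could look, assuming the conversion keeps only the time of day and discards the date portion; the actual conversion code may differ, and &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; is a hypothetical alias:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix is to drop the date and keep the time of day.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;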
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
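&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person is stored with the code &amp;#039;UNK&amp;#039; and that the replacement happens when reading the log table; the committed conversion code may do this differently, and &amp;lt;code&amp;gt;made_by_converted&amp;lt;/code&amp;gt; is a hypothetical alias:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: &amp;#039;UNK&amp;#039; as the stored person code is an assumption.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;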
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
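&lt;br /&gt;
Taken together with the trailing-space fix of problem #31, the cleanup can be sketched as below. This assumes the conversion applies the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 query uses; &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; are hypothetical aliases:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: lists each follow animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned form.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;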
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both of the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there is otherwise no value.&lt;br /&gt;
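&lt;br /&gt;
A minimal sketch of the substitution described in problems #37 and #38, assuming the observer codes are space-trimmed as described in problem #36.  The sample observer codes are made up, and this shows only the substitution, not how the conversion decides whether to create a FOLLOW_OBSERVERS row.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Made-up sample rows; the real data is in clean.follow.&lt;br /&gt;
with sample (fol_am_observer_1, fol_am_observer_2) as (&lt;br /&gt;
  values (&amp;#039;ABC&amp;#039;, &amp;#039;  &amp;#039;)&lt;br /&gt;
       , (null, &amp;#039;XYZ&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
-- NULL, empty, and all-space values become the NONE person.&lt;br /&gt;
select coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from sample;&lt;br /&gt;
-- Yields (ABC, NONE) and (NONE, XYZ).&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;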
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
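&lt;br /&gt;
Note that the query above reports a name containing &amp;quot;/&amp;quot; only when some other name differs from it solely by character case.  As an alternative, the following sketch lists every trimmed observer value containing &amp;quot;/&amp;quot;, with the number of times it is used across the four columns.  It uses only the columns named above; the &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; aliases are introduced here for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Every observer value containing a slash, regardless of case variants.&lt;br /&gt;
-- This includes n/a, which is not really 2 people.&lt;br /&gt;
select observers.person, count(*) as uses&lt;br /&gt;
  from (select btrim(fol_am_observer_1) as person from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_am_observer_2) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_1) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_2) from clean.follow) as observers&lt;br /&gt;
  where strpos(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  group by observers.person&lt;br /&gt;
  order by observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;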
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
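&lt;br /&gt;
One way to pick a single spelling per observer is sketched below.  This is an illustration only, not the conversion code: it assumes that &amp;quot;mixed case&amp;quot; means a value that is neither all upper-case nor all lower-case, and the sample names are made up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Made-up sample names; the real values are in the observer columns of easy.follow.&lt;br /&gt;
with sample (person) as (&lt;br /&gt;
  values (&amp;#039;ABC&amp;#039;), (&amp;#039;Abc&amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;XYZ&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
-- Keep one row per case-insensitive name, preferring a mixed-case spelling.&lt;br /&gt;
-- When no mixed-case spelling exists the choice falls back to sort order.&lt;br /&gt;
select distinct on (lower(person))&lt;br /&gt;
       lower(person) as name_key&lt;br /&gt;
     , person as chosen&lt;br /&gt;
  from sample&lt;br /&gt;
  order by lower(person)&lt;br /&gt;
         , (person &amp;lt;&amp;gt; upper(person) and person &amp;lt;&amp;gt; lower(person)) desc&lt;br /&gt;
         , person;&lt;br /&gt;
-- Chooses Abc for the first name and XYZ for the second.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;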
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
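&lt;br /&gt;
To see where the unmatched arrivals come from, the same rows can be summarized by data source.  This is a sketch that reuses the query above together with the &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; column (see problem #51).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count of arrivals with no related follow, per data source&lt;br /&gt;
SELECT follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  GROUP BY follow_arrival.fa_data_source&lt;br /&gt;
  ORDER BY count(*) DESC, follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;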
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
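&lt;br /&gt;
Because the cause may lie in an individual&amp;#039;s entry date rather than in the arrival rows, it can help to summarize per arriving individual.  A sketch based on the query above; the &amp;lt;code&amp;gt;first_early_arrival&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;last_early_arrival&amp;lt;/code&amp;gt; column names are introduced here for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per arriving individual with arrivals before the entry date&lt;br /&gt;
select follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_entrydate&lt;br /&gt;
     , min(follow_arrival.fa_fol_date) as first_early_arrival&lt;br /&gt;
     , max(follow_arrival.fa_fol_date) as last_early_arrival&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid, biography.b_entrydate&lt;br /&gt;
  order by count(*) desc, follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;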
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
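&lt;br /&gt;
When checking dates and ids, it may help to sort by how long after the departure the arrival occurred, since very large gaps and very small gaps likely have different causes.  A sketch, assuming &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;b_departdate&amp;lt;/code&amp;gt; are both dates or timestamps; the &amp;lt;code&amp;gt;gap&amp;lt;/code&amp;gt; column is introduced here for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Arrivals after departure, largest gap first&lt;br /&gt;
select follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , follow_arrival.fa_fol_date&lt;br /&gt;
     , biography.b_departdate&lt;br /&gt;
     , follow_arrival.fa_fol_date - biography.b_departdate as gap&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by gap desc, follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;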
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
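&lt;br /&gt;
The effect of the extra year in the comparison can be checked with a small worked example.  The dates below are made up; the expression is the one used in the query above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Made-up dates: the same individual, seen the day before and on her 15th birthday.&lt;br /&gt;
with sample (b_birthdate, fa_fol_date) as (&lt;br /&gt;
  values (&amp;#039;1990-06-01&amp;#039;::date, &amp;#039;2005-05-31&amp;#039;::date)&lt;br /&gt;
       , (&amp;#039;1990-06-01&amp;#039;::date, &amp;#039;2005-06-01&amp;#039;::date)&lt;br /&gt;
)&lt;br /&gt;
select b_birthdate&lt;br /&gt;
     , fa_fol_date&lt;br /&gt;
     , b_birthdate &amp;lt;= (fa_fol_date&lt;br /&gt;
                       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                       - &amp;#039;14 years&amp;#039;::INTERVAL) as flagged&lt;br /&gt;
  from sample;&lt;br /&gt;
-- The first row (still 14) is not flagged; the second row (15) is flagged.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;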
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
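&lt;br /&gt;
Together with problem #42, the rule amounts to the following sketch.  It is an illustration, not the code that populates the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the sample rows are made up, including the &amp;lt;code&amp;gt;M&amp;lt;/code&amp;gt; sex code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Made-up sample rows standing in for follow_arrival joined to biography.&lt;br /&gt;
with sample (b_sex, fa_type_of_cycle) as (&lt;br /&gt;
  values (&amp;#039;F&amp;#039;, &amp;#039;n/a&amp;#039;)&lt;br /&gt;
       , (&amp;#039;M&amp;#039;, &amp;#039;n/a&amp;#039;)&lt;br /&gt;
       , (&amp;#039;F&amp;#039;, null)&lt;br /&gt;
       , (&amp;#039;F&amp;#039;, &amp;#039;U&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
select b_sex&lt;br /&gt;
     , fa_type_of_cycle as original&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_type_of_cycle is null then &amp;#039;MISS&amp;#039;    -- problem #42&lt;br /&gt;
         when b_sex = &amp;#039;F&amp;#039; and fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
           then &amp;#039;MISS&amp;#039;                                -- problem #50&lt;br /&gt;
         else fa_type_of_cycle&lt;br /&gt;
       end as converted&lt;br /&gt;
  from sample;&lt;br /&gt;
-- The converted values are MISS, n/a, MISS, and U, in that order.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;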
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
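&lt;br /&gt;
The spelling corrections above can be sketched as a single expression.  This is an illustration rather than the actual conversion code; the sample rows simply list the spellings mentioned above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sample rows built from the spellings listed above.&lt;br /&gt;
with sample (fa_data_source) as (&lt;br /&gt;
  values (&amp;#039;BREC&amp;#039;), (&amp;#039;brec&amp;#039;), (&amp;#039;TikI&amp;#039;), (&amp;#039;Tiki_GM&amp;#039;), (&amp;#039;Brec&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
select fa_data_source as original&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source   -- valid and newly added codes pass through&lt;br /&gt;
       end as cleaned&lt;br /&gt;
  from sample;&lt;br /&gt;
-- The cleaned values are Brec, Brec, Tiki, Tiki_GM, and Brec.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;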
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
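&lt;br /&gt;
No query is given above for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  The following sketch adapts the previous query by dropping that column; it is presumed, but not confirmed, to be the query behind the 430 row figure.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;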
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
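&lt;br /&gt;
A minimal sketch of that cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query under Bad data does not name a schema) and that the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type. It is not necessarily the statement the conversion scripts use.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the conversion works on clean.follow_arrival.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query under Bad data against the same table afterwards should return no rows.&lt;br /&gt;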
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database probably allows this condition because the character case of the column names differs from that of the primary key designations.&lt;br /&gt;
With the focal column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;), the query above failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything about the affected rows.&lt;br /&gt;
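&lt;br /&gt;
As a sketch, the change amounts to a single statement against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. It is not necessarily the statement the conversion scripts use.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace every 0 with NULL; no other column is checked.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;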
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
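&lt;br /&gt;
A hedged sketch of the second half of that change. It assumes a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row already exists in ARRIVAL_SOURCES (that table&amp;#039;s column names are not given on this page, so the insert is not shown), and it is not necessarily what the commit referenced above does.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;none&amp;#039; has already been added to ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;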
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
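&lt;br /&gt;
One way to preview that mapping, sketched as a read-only query rather than as the conversion&amp;#039;s actual code. The output column names &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;row_count&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- A value not covered by the rule comes out as NULL in bad_observation,&lt;br /&gt;
-- which makes unexpected flag values easy to spot.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag = &amp;#039; &amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY raw_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;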
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is no separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would also address all Problem #71 infractions.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
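== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; as required.&lt;br /&gt;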
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
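&lt;br /&gt;
The proposed conversions above can be previewed with a query like the following sketch. It assumes &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; is meant as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern (any value starting with &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt;); the output column names are made up for illustration, and the mapping itself is still awaiting confirmation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Raw values the list does not cover (including NULL) show up&lt;br /&gt;
-- with a NULL converted_flag.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039; &amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;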
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** &amp;#039;?&amp;#039; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;X&amp;#039;&lt;br /&gt;
** &amp;#039;X%&amp;#039; to &amp;#039;X&amp;#039;&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
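&lt;br /&gt;
A sketch of the two proposed stages applied to a single flag. &amp;lt;code&amp;gt;ae_chase_flag&amp;lt;/code&amp;gt; was picked arbitrarily, the percent sign is assumed to be a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; wildcard, and the names &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;raw_value&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;clean_value&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; are made up for illustration. The query only reads data; it is a preview, not the conversion itself.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  -- Stage 1: the proposed clean schema conversions.&lt;br /&gt;
  SELECT ae.ae_chase_flag AS raw_value&lt;br /&gt;
       , CASE&lt;br /&gt;
           WHEN ae.ae_chase_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           ELSE ae.ae_chase_flag&lt;br /&gt;
         END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
)&lt;br /&gt;
-- Stage 2: the proposed sokwe conversions. Values the list does not&lt;br /&gt;
-- cover (including NULL) come out as NULL in sokwe_value.&lt;br /&gt;
SELECT raw_value&lt;br /&gt;
     , clean_value&lt;br /&gt;
     , CASE clean_value WHEN &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039; WHEN &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039; END AS sokwe_value&lt;br /&gt;
     , COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;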
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id IS NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
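&lt;br /&gt;
The inflation described in the first note can be avoided by testing both columns on each row instead of pivoting them. The following is a sketch only, using the same exact-match test and the same exclusion of NULL and empty values as the queries in this section; the event_rows alias is illustrative, and the result has not been checked against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Counts each AGGRESSION_EVENT row once, even when both ids are unmatched.&lt;br /&gt;
SELECT COUNT(*) AS event_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;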
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside their valid study participation window in the biography data. Note that the query strips white space when comparing aggression participant ids with biography animids: the focus here is on events outside the defined date ranges rather than on matching ids, and the query assumes that the white space issues will be resolved (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
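&lt;br /&gt;
A variant sketch of the query above that also counts the rows for each raw value and makes NULL explicit, so that NULL and the different white space values can be compared. The &amp;lt;NULL&amp;gt; placeholder and the column aliases are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fight_category IS NULL AS is_null,&lt;br /&gt;
    -- Quote the value so leading and trailing white space is visible.&lt;br /&gt;
    &amp;#039;&amp;quot;&amp;#039; || COALESCE(ae_fight_category, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) || &amp;#039;&amp;quot;&amp;#039; AS quoted_value,&lt;br /&gt;
    -- NULL for NULL values, otherwise the number of characters.&lt;br /&gt;
    LENGTH(ae_fight_category) AS raw_length,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC, quoted_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;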
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could be generated for other reasons as well.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#84) Some follows have a community with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 follows where the fol_cl_community_id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
The data is already cleaned up in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || fol_cl_community_id || &amp;#039;&amp;quot;&amp;#039; as problem_id,&lt;br /&gt;
  *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_cl_community_id) &amp;lt;&amp;gt; fol_cl_community_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=606</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=606"/>
		<updated>2026-05-26T15:53:51Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #74 minor formatting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds, where KL = Kalande and KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
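&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;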
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
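&lt;br /&gt;
A sketch of the combined cleanup from problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The output column names are illustrative, and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each animid that the cleanup would change, next to its cleaned form&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;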
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
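&lt;br /&gt;
A sketch of a query that could list the affected follows.  It assumes the observer columns described in problem #36, and that an observer counts as missing when the column is NULL or blank after trimming:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where exactly one of the two observers is recorded,&lt;br /&gt;
-- in either the AM or the PM period&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;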
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
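&lt;br /&gt;
Because the query above joins each name to its case variants, it reports a &amp;quot;/&amp;quot; name only when another spelling of that name, differing in character case, also exists.  A simpler sketch, assuming the goal is to list every distinct observer value containing a &amp;quot;/&amp;quot; (the &amp;lt;code&amp;gt;obs&amp;lt;/code&amp;gt; alias is illustrative only):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Unpivot the four observer columns into a single column, then keep&lt;br /&gt;
-- the distinct trimmed values that contain a slash.&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.person) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)&lt;br /&gt;
         ) AS obs (person)&lt;br /&gt;
  WHERE STRPOS(obs.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;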
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case, or about half that many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
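&lt;br /&gt;
A sketch of how the replacement codes might be listed, assuming &amp;quot;mixed case&amp;quot; means a spelling that is neither all upper case nor all lower case.  That definition, and the &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;mixed&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;old_code&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;new_code&amp;lt;/code&amp;gt; names, are assumptions made for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  -- Every distinct, non-empty, trimmed observer value.&lt;br /&gt;
  SELECT DISTINCT BTRIM(obs.person) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
         CROSS JOIN LATERAL&lt;br /&gt;
           (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
                 , (follow.fol_am_observer_2)&lt;br /&gt;
                 , (follow.fol_pm_observer_1)&lt;br /&gt;
                 , (follow.fol_pm_observer_2)&lt;br /&gt;
           ) AS obs (person)&lt;br /&gt;
    WHERE BTRIM(obs.person) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
)&lt;br /&gt;
SELECT people.person AS old_code&lt;br /&gt;
     , mixed.person AS new_code&lt;br /&gt;
  FROM people&lt;br /&gt;
       JOIN people AS mixed&lt;br /&gt;
            ON (LOWER(mixed.person) = LOWER(people.person))&lt;br /&gt;
  WHERE people.person &amp;lt;&amp;gt; mixed.person&lt;br /&gt;
        -- Assumed definition of a mixed-case spelling.&lt;br /&gt;
        AND mixed.person &amp;lt;&amp;gt; UPPER(mixed.person)&lt;br /&gt;
        AND mixed.person &amp;lt;&amp;gt; LOWER(mixed.person)&lt;br /&gt;
  ORDER BY LOWER(people.person), people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If a name has more than one mixed-case spelling, each one appears as a candidate &amp;lt;code&amp;gt;new_code&amp;lt;/code&amp;gt;, and the choice between them has to be made by hand.&lt;br /&gt;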
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
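&lt;br /&gt;
As a worked example of the cutoff arithmetic in the query above, using a hypothetical follow date of 2000-06-15:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The follow date is hypothetical, for illustration only.&lt;br /&gt;
-- The result is 1985-06-15 00:00:00, so a female followed on&lt;br /&gt;
-- 2000-06-15 is flagged only when born on or before 1985-06-15,&lt;br /&gt;
-- that is, when she is at least 15 years old.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL AS cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;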
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
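&lt;br /&gt;
A minimal sketch of the change of cycle code, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted code (see problem #42):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Re-code female arrivals recorded as n/a to MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;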
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
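&lt;br /&gt;
The case corrections listed above (&amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;) could be applied with something like the following sketch, which assumes the corrections are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only the three misspelled codes are touched; all other values,&lt;br /&gt;
-- including the codes to be added, are left as they are.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;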
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
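&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above.  It was presumably along the lines of the following sketch, which is the previous query with the end time removed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;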
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
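&lt;br /&gt;
A minimal sketch of the cleanup, assuming the table meant is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Rewrite only the rows whose focal id has trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;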
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
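The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column name spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; fails with the error below; the table spells the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;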
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
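&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema that the query above reads; the conversion itself may apply it at a different step:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;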
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
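&lt;br /&gt;
A minimal sketch of such a cleanup, assuming a &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; table with the same column name as the query above; it is an illustration, not the code from the commit:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;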
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
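&lt;br /&gt;
The mapping can be previewed with a query along the following lines. This is only a sketch: the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, and using &amp;lt;code&amp;gt;btrim()&amp;lt;/code&amp;gt; to match one or more spaces is an assumption. Any value the rule does not cover yields &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so it stands out in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE  -- spaces only&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- NULL means the value is not covered&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;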
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a `b_rec_english` value of `ALL` or `F-F (ALL); F-M (ALL); M-M (ALL)`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than `Y` or `N`, one of which is required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
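&lt;br /&gt;
Pending that confirmation, the proposed conversions could be previewed with a query like the one below. It is a sketch only: the output column names converted_flag and row_count are made up for illustration, and values the list above does not cover (including NULL) are deliberately left as NULL rather than guessed at.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value next to its proposed sokwe value.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag IN (&amp;#039; &amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS converted_flag,  -- NULL means the value is not in the list&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;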
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
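&lt;br /&gt;
Pending that confirmation, the two conversion steps could be previewed for a single flag as sketched below. ae_chase_flag is used only as an example, and the names cleaned, clean_value, and sokwe_value are made up for illustration. A value that neither step covers (including NULL) yields a NULL sokwe_value.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: apply the clean schema step, then the sokwe step.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae.ae_chase_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae.ae_chase_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN ae.ae_chase_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN ae.ae_chase_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae.ae_chase_flag  -- anything else passes through unchanged&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE clean_value&lt;br /&gt;
    WHEN &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;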
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips whitespace when comparing aggression participant ids with biography animids: the focus here is on checking for events outside of defined date ranges rather than on matching ids, and the query assumes that whitespace issues will be resolved (as is also done in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (this restriction is not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
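&lt;br /&gt;
As a sketch only, and not part of the original assessment, the rows reported above can be tallied by issue type. The query reuses the same conditions and labels; the &amp;lt;code&amp;gt;row_count&amp;lt;/code&amp;gt; alias is an illustrative assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count the offending rows per kind of time problem.&lt;br /&gt;
SELECT&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
GROUP BY 1&lt;br /&gt;
ORDER BY row_count DESC, time_issue;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;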
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
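&lt;br /&gt;
A possible refinement, offered as a sketch rather than as part of the original assessment: concatenating NULL yields NULL, so the query above shows NULL categories as a single unquoted NULL. The variant below labels NULLs explicitly and adds the string length and a row count for each raw value. The aliases and the &amp;#039;NULL&amp;#039; label are illustrative assumptions, and the query assumes the column is a text type.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per distinct raw value, quoted so that white space is visible.&lt;br /&gt;
SELECT&lt;br /&gt;
    COALESCE(&amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039;, &amp;#039;NULL&amp;#039;) AS raw_value,&lt;br /&gt;
    LENGTH(ae_fight_category) AS raw_length,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_fight_category&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;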
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where &amp;lt;code&amp;gt;ae_b_aggressor_id&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;ae_b_recipient_id&amp;lt;/code&amp;gt; contain the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could also be generated for other reasons.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=605</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=605"/>
		<updated>2026-05-25T21:18:08Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #83 add a processing note regarding not strict error message regarding this issue.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
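&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied when reading &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the column is textual; the conversion code may implement it differently. The &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person, UNK.&lt;br /&gt;
SELECT COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by_or_unk&lt;br /&gt;
  FROM clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;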
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
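&lt;br /&gt;
Taken together with problem #31, the cleanup amounts to trimming trailing spaces and upper-casing the animid.  The query below is a minimal sketch of that normalization, not the actual conversion code; the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is made up for illustration.  It previews only the rows that would change.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the normalized animid next to the original,&lt;br /&gt;
-- for rows where normalization changes the value.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;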
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
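&lt;br /&gt;
A minimal sketch of the fill-in, shown for the AM period only and assuming the person code is the literal string &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;obs_tiki&amp;lt;/code&amp;gt; aliases mirror the target columns described in problem #36; this is an illustration, not the conversion program&amp;#039;s code.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: AM-period observers, with NONE filling in&lt;br /&gt;
-- whichever observer is NULL or blank.  Follows with no AM&lt;br /&gt;
-- observer at all are skipped, as no row is created for them.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;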
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case, or about half that many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates is complicated to do in the conversion process, so resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
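&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema after the biography table has been loaded there. The actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mark the n/a cycle codes of females as missing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;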
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
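&lt;br /&gt;
The two case corrections listed in this solution could be sketched as follows, assuming they are applied as plain updates in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. This is an illustration, not necessarily how the conversion performs the cleanup:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;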
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
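&lt;br /&gt;
The start-time-only check described in the problem statement is not shown as a query. A sketch that follows the same pattern as the query above, simply leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the grouping, might look like this (untested against the current dump):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time never match in the &amp;lt;code&amp;gt;exists&amp;lt;/code&amp;gt; comparison, so they would not be listed by this sketch.&lt;br /&gt;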
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
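&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the column is a variable-length text type and that the fix is applied as an update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. The real conversion may instead trim the values while copying them:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal ids.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;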
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
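&lt;br /&gt;
As a sketch, assuming the change is made with a direct update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion code itself may apply it in some other way):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the zero id numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;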
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
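&lt;br /&gt;
One way to preview this mapping before applying it is sketched below. The output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and the flag is assumed to be stored as text. Any value not covered by the rule comes out as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, which makes unexpected values easy to spot:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show how each distinct flag value would be converted.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;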
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a `b_rec_english` value of `ALL` or `F-F (ALL); F-M (ALL); M-M (ALL)`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
ORDER BY&lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have a ae_recipient_certainty_flag other than `Y` or `N` as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
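&lt;br /&gt;
A minimal sketch of the proposed conversions as a PostgreSQL expression, pending the confirmation requested above. It reads &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern (any value starting with &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt;), which is an assumption. The &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; CTE and the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; name are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Illustrative only: the sample values stand in for ae_recipient_certainty_flag.&lt;br /&gt;
WITH sample(raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN raw_flag IN (&amp;#039; &amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN raw_flag = &amp;#039;X&amp;#039; OR raw_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- no ELSE: values outside the list yield NULL and need a decision&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
  FROM sample;&amp;lt;/pre&amp;gt;&lt;br /&gt;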
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
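&lt;br /&gt;
A minimal sketch of the two proposed steps chained together, pending the confirmation requested above. It reads &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; patterns and, in the second step, treats a space and an empty string alike by trimming; both are assumptions. The CTE and column names are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Illustrative only: the sample values stand in for any one of the event flags.&lt;br /&gt;
WITH sample(raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039;?&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039; &amp;#039;)&lt;br /&gt;
),&lt;br /&gt;
clean_step AS (&lt;br /&gt;
  -- Step 1: clean schema conversions&lt;br /&gt;
  SELECT raw_flag&lt;br /&gt;
       , CASE&lt;br /&gt;
           WHEN raw_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
           WHEN raw_flag LIKE &amp;#039;Y%&amp;#039; OR raw_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           ELSE raw_flag&lt;br /&gt;
         END AS clean_flag&lt;br /&gt;
    FROM sample&lt;br /&gt;
)&lt;br /&gt;
-- Step 2: sokwe conversions&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , clean_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN BTRIM(clean_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN clean_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS sokwe_flag&lt;br /&gt;
  FROM clean_step;&amp;lt;/pre&amp;gt;&lt;br /&gt;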
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
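&lt;br /&gt;
To illustrate the first note, the following self-contained sketch shows how the pivot can count one event twice, and how counting distinct events avoids that. The inline tables, the placeholder ids &amp;lt;code&amp;gt;id_1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;id_2&amp;lt;/code&amp;gt;, and the &amp;lt;code&amp;gt;ae_id&amp;lt;/code&amp;gt; key are hypothetical; the real table may need a different way to identify a row.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Illustrative only: toy data, not the real tables.&lt;br /&gt;
WITH aggression_event(ae_id, ae_b_aggressor_id, ae_b_recipient_id) AS (&lt;br /&gt;
  VALUES (1, &amp;#039;id_1&amp;#039;, &amp;#039;id_2&amp;#039;)  -- both ids unknown: 2 pivoted rows, 1 event&lt;br /&gt;
       , (2, &amp;#039;id_1&amp;#039;, &amp;#039;VIN&amp;#039;)   -- aggressor unknown: 1 pivoted row, 1 event&lt;br /&gt;
       , (3, &amp;#039;VIN&amp;#039;, &amp;#039;VIN&amp;#039;)    -- both ids known: not a problem row&lt;br /&gt;
),&lt;br /&gt;
biography(b_animid) AS (&lt;br /&gt;
  VALUES (&amp;#039;VIN&amp;#039;)&lt;br /&gt;
),&lt;br /&gt;
pivoted AS (&lt;br /&gt;
  SELECT ae.ae_id, v.role_name, v.participant&lt;br /&gt;
  FROM aggression_event ae&lt;br /&gt;
  CROSS JOIN LATERAL (&lt;br /&gt;
    VALUES&lt;br /&gt;
      (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
      (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
  ) AS v(role_name, participant)&lt;br /&gt;
  WHERE NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    COUNT(*) AS pivoted_rows,            -- one per unknown participant&lt;br /&gt;
    COUNT(DISTINCT ae_id) AS event_rows  -- one per affected event&lt;br /&gt;
FROM pivoted;&amp;lt;/pre&amp;gt;&lt;br /&gt;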
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside their valid study participation window in the biography data. The query strips white space from the aggression participant ids before comparing them with the biography animid values, because the focus here is on events outside the defined date ranges rather than on id matching; it assumes that the white space issues will be resolved (this is also done in the migration work-around). The query further restricts the check to aggression events that have a valid follow record (this is not done in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
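&lt;br /&gt;
A quick way to re-check the row count quoted in the problem statement is a plain count. This is only a sketch built from the column names used in the query above; it is not part of the conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count rows where aggressor and recipient are the same animal.&lt;br /&gt;
-- A NULL on either side never compares equal, so no IS NOT NULL test is needed.&lt;br /&gt;
SELECT COUNT(*) AS same_animal_rows&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;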
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
&lt;br /&gt;
This data problem generates the following error. Note that, in practice, the same error could also be generated for other reasons.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./load_chunks.sh load_aggressions.m4 clean.aggression_event&lt;br /&gt;
psql:&amp;lt;stdin&amp;gt;:394: ERROR:  duplicate key value violates unique constraint &amp;quot;On ROLES, Participant + EID must be unique&amp;quot;&lt;br /&gt;
DETAIL:  Key (participant, eid)=(FD, 759513) already exists.&lt;br /&gt;
CONTEXT:  SQL statement &amp;quot;INSERT INTO roles (&lt;br /&gt;
        eid&lt;br /&gt;
      , role&lt;br /&gt;
      , participant)&lt;br /&gt;
    VALUES (&lt;br /&gt;
      CURRVAL(&amp;#039;events_eid_seq&amp;#039;)&lt;br /&gt;
    , &amp;#039;Actee&amp;#039;&lt;br /&gt;
    , this_ae.ae_b_recipient_id)&amp;quot;&lt;br /&gt;
PL/pgSQL function inline_code_block line 186 at SQL statement&lt;br /&gt;
make: *** [Makefile:373: load_aggressions] Error 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=604</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=604"/>
		<updated>2026-05-25T21:14:16Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #83 remove row order from bad data select&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
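&lt;br /&gt;
As a rough sketch of one way such a fix can work (an assumption; the actual conversion code is not shown on this page, and the column is assumed to be a timestamp), casting the column to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; drops the date portion and keeps only the time of day:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: show the stored value next to its time-of-day portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;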
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4.&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
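&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; over the columns used in the queries on this page (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: replace a NULL made_by with the unknown person, UNK.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;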
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
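&lt;br /&gt;
A minimal sketch of how this cleanup and the trailing-space cleanup of problem #31 could be previewed together, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The &amp;lt;code&amp;gt;old_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_animid&amp;lt;/code&amp;gt; column aliases are made up for illustration, and this is not necessarily how the conversion process itself does the cleanup.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned-up value.&lt;br /&gt;
-- old_animid and new_animid are illustrative aliases.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;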
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
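&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied when the arrivals are read rather than by changing the stored data.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is made up for illustration, and the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; row is assumed to already exist in &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report MISS in place of a NULL cycle code.&lt;br /&gt;
-- cycle_state is an illustrative alias.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;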
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
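&lt;br /&gt;
A minimal sketch of such a change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; that reuses the conditions of the query above.  The statement the conversion actually runs is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; relies on PostgreSQL&amp;#039;s UPDATE ... FROM syntax.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;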
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
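&lt;br /&gt;
The two case corrections could be written as follows.  This is a sketch, assuming the fix is made with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; adding the other codes to the set of valid values is a separate step and is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the mis-capitalized codes listed above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;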
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
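&lt;br /&gt;
The looser check described in the Problem section, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, has no query listed.  A sketch of it, adapted from the previous query by dropping that one column:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;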
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
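&lt;br /&gt;
A sketch of the cleanup, assuming the column is a variable-length text type (so that the trailing spaces are significant in comparisons) and that the fix is made with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;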
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
The column name ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt;.  With the &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt; spelling the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
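&lt;br /&gt;
A sketch of the change, assuming it is made with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;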
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
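&lt;br /&gt;
To preview the effect of this rule, the summary query above can be extended with a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  This is a sketch; the output column name &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is made up for illustration, and any value the rule does not cover shows up with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; converted value.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;)&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;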
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
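&lt;br /&gt;
The proposed sokwe mapping above could be previewed with a query along the following lines. This is only a sketch pending confirmation: the trimmed, case-insensitive comparison and the choice to leave NULL (and any other unlisted value) unmapped are assumptions, not decisions recorded on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the proposed mapping without changing any data.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- blank, N and NO map to 0&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    -- X and anything starting with Y map to 1&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- assumption: NULL and unlisted values stay unmapped for review&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS proposed_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;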
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flag columns (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than the required `X` or NULL in one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
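&lt;br /&gt;
As an illustration of the proposed clean schema conversions, the following sketch previews the mapping for a single flag column (ae_decided_flag, chosen arbitrarily). It is not a confirmed rule: the trimmed, case-insensitive matching and passing NULL and unlisted values through unchanged are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the proposed clean-schema mapping for one flag column.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_decided_flag AS raw_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
    -- assumption: NULL and unlisted values pass through unchanged&lt;br /&gt;
    ELSE ae_decided_flag&lt;br /&gt;
  END AS proposed_clean_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;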
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields one pivoted record for each of ae_b_aggressor_id and ae_b_recipient_id, so the count is the number of pivoted records rather than the number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a trailing space) does not match `VIN` in biography.&lt;br /&gt;
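&lt;br /&gt;
To obtain the number of offending AGGRESSION_EVENT rows rather than the number of pivoted records, each event can be counted once by testing both participant columns in a single WHERE clause. The following is a sketch under the same conditions as the queries in this section (NULL and empty-string participants excluded, whitespace significant); it has not been run against the data, so no count is given.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count each AGGRESSION_EVENT row once, however many participants fail to match.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1 FROM clean.biography b WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1 FROM clean.biography b WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;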
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
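&lt;br /&gt;
The query above does not match values that consist only of space characters. A variant that trims before comparing, as the query for problem #80 does, is sketched here; whether such values occur in this column, and whether the count of 9,102 would change, has not been checked.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: also count values made up entirely of spaces.&lt;br /&gt;
SELECT COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_recipient_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_recipient_behavior) = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;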
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside the valid study participation window recorded for that animal in the biography data. Note that the query strips white space when comparing aggression participant ids with biography animid ids: the focus here is on events outside the defined date ranges rather than on matching ids, and it is assumed that the white space issues will be resolved (the migration work-around also strips white space). The query further restricts the check to aggression events for which there is a valid follow record (the migration work-around does not).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not one of the allowable values (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;). Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=603</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=603"/>
		<updated>2026-05-25T21:06:18Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #83 there are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
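&lt;br /&gt;
A minimal sketch of one way to keep only the time portion, assuming a plain cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; is acceptable; the actual conversion code may differ.  The &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; rows are made up for illustration and are not taken from BRECORD_NOTES:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows, one with the expected date and one without.&lt;br /&gt;
SELECT brec_time&lt;br /&gt;
     , brec_time::TIME AS time_only  -- the cast drops the date portion&lt;br /&gt;
  FROM (VALUES (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
             , (TIMESTAMP &amp;#039;1975-03-02 07:15:00&amp;#039;))&lt;br /&gt;
       AS sample (brec_time);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Both sample rows produce the same &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; value, regardless of the date stored alongside the time.&lt;br /&gt;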
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
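&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;; the referenced commit may implement it differently.  The &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; rows and the &amp;lt;code&amp;gt;ABC&amp;lt;/code&amp;gt; person code are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows: one with a person code, one with NULL.&lt;br /&gt;
SELECT made_by&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) AS converted_made_by  -- NULL becomes UNK&lt;br /&gt;
  FROM (VALUES (&amp;#039;ABC&amp;#039;), (NULL))&lt;br /&gt;
       AS sample (made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The non-NULL code passes through unchanged, so only rows lacking a MadeBy value are attributed to the unknown person.&lt;br /&gt;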
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
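&lt;br /&gt;
A minimal sketch of the cleanup described in problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; rows are hypothetical, and the conversion process itself may apply the cleanup differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids: trailing spaces, lower-case, and already clean.&lt;br /&gt;
SELECT raw_animid&lt;br /&gt;
     , upper(rtrim(raw_animid)) AS cleaned_animid&lt;br /&gt;
  FROM (VALUES (&amp;#039;FD  &amp;#039;), (&amp;#039;fd&amp;#039;), (&amp;#039;FD&amp;#039;))&lt;br /&gt;
       AS sample (raw_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All three sample values clean up to &amp;lt;code&amp;gt;FD&amp;lt;/code&amp;gt;, so they would match the same BIOGRAPHY row.&lt;br /&gt;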
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
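&lt;br /&gt;
The mapping described above can be sketched as a query.  This is an illustration and not the actual conversion code: the output column names and the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow and time period having at least one observer.&lt;br /&gt;
-- Assumed: the output column names and the &amp;#039;AM&amp;#039;/&amp;#039;PM&amp;#039; period codes.&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;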
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
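&lt;br /&gt;
This entry lists no query for the affected rows.  A possible sketch, assuming the same test for a blank observer as in problem #37 (NULL or empty after trimming), finds follows where exactly one observer of the AM pair, or exactly one of the PM pair, is present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Comparing the two &amp;quot;is blank&amp;quot; tests with &amp;lt;&amp;gt; is true only&lt;br /&gt;
-- when exactly one observer of the pair is blank.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;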
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
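&lt;br /&gt;
Because of the self-join, the query above reports only those names containing &amp;quot;/&amp;quot; that also have a variant differing in character case.  As a hedged alternative, the following sketch lists every distinct observer value containing &amp;quot;/&amp;quot;; the &amp;lt;code&amp;gt;o&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observer&amp;lt;/code&amp;gt; aliases are illustrative.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Unpivot the four observer columns, then keep values containing &amp;quot;/&amp;quot;.&lt;br /&gt;
-- NULL observers drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT DISTINCT BTRIM(o.observer) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)) AS o (observer)&lt;br /&gt;
  WHERE STRPOS(o.observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;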
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ from another name only by character case.  Ignoring case, that is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
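&lt;br /&gt;
A minimal sketch of the substitution, shown as a query rather than as the actual conversion code.  The &amp;lt;code&amp;gt;converted_cycle&amp;lt;/code&amp;gt; alias is hypothetical, and creating the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; row in &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; is not shown.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Arrivals with no cycle information, with the replacement code.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as converted_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;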
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
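&lt;br /&gt;
To make the age cutoff concrete, here is a worked example of the date arithmetic in the query above, using a hypothetical follow date of 2000-06-15.  The column alias is illustrative.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The follow date less 15 years (1 year + 14 years) is the latest&lt;br /&gt;
-- birthdate that is flagged.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
-- A female born on or before the resulting date, 1985-06-15, has&lt;br /&gt;
-- reached her 15th birthday by the follow date and is flagged.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;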
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
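&lt;br /&gt;
The actual cleanup code is not shown on this page.  As a sketch only, and assuming the change is made with a single statement, it might look like the following.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: set the cycle code to MISS for female arrivals coded n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;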
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
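&lt;br /&gt;
The two case fixes above can be previewed with a query such as the following sketch.  It is an illustration, not the actual cleanup code, and the &amp;lt;code&amp;gt;cleaned_source&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Each existing code, the code it would become, and its row count.&lt;br /&gt;
select fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as cleaned_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;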
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
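&lt;br /&gt;
No query is given for the third check, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  The following sketch adapts the query above to that case; it is not necessarily the query that produced the count of 430.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; are grouped together by &amp;lt;code&amp;gt;group by&amp;lt;/code&amp;gt; but are never matched by &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;.  If NULL times should count as equal, &amp;lt;code&amp;gt;is not distinct from&amp;lt;/code&amp;gt; could be used in place of &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;.&lt;br /&gt;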
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
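&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the schema qualifier is an assumption, since the query above names no schema) and that the trimmed ids do not collide with existing keys:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- Assumes the table lives in the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the update succeeds, re-running the query under Bad data should return no rows.&lt;br /&gt;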
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
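&lt;br /&gt;
The change itself can be sketched as a single update, assuming it is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than in the source database:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal ID numbers with NULL.&lt;br /&gt;
-- No other column is inspected, so nothing about the rows is validated.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;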
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
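&lt;br /&gt;
As a rough illustration of the substitution, the following sketch previews the result without changing any data. It assumes that the code added to ARRIVAL_SOURCES is the literal string &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; is a text column; neither is confirmed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count rows per data source, with NULL shown as the&lt;br /&gt;
-- assumed &amp;#039;none&amp;#039; code.&lt;br /&gt;
SELECT COALESCE(fa_data_source, &amp;#039;none&amp;#039;) AS data_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  GROUP BY COALESCE(fa_data_source, &amp;#039;none&amp;#039;)&lt;br /&gt;
  ORDER BY data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;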
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
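&lt;br /&gt;
One way to express this rule in a query, as a sketch. It assumes the flag is a text column and that a value made up only of spaces counts as a space; the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: map the raw flag to a boolean.&lt;br /&gt;
-- NULL and blank become FALSE; X and YES become TRUE.&lt;br /&gt;
-- Any other value yields NULL so that it stands out for review.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;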
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per-community, per-year. The duplicates each have a `b_rec_english` value of `ALL` or `F-F (ALL); F-M (ALL); M-M (ALL)`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have a ae_recipient_certainty_flag other than `Y` or `N` as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
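&lt;br /&gt;
The proposed mapping, pending confirmation, could be previewed with a sketch like the following. It assumes the flag is a text column; the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; name is hypothetical, and the list above does not say how &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; should be converted, so the sketch leaves it as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: apply the listed conversions with a CASE expression.&lt;br /&gt;
-- Values matching none of the patterns yield NULL so they can be reviewed.&lt;br /&gt;
SELECT&lt;br /&gt;
ae.ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
CASE&lt;br /&gt;
  WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
  WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
  WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
END AS converted_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;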
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
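&lt;br /&gt;
The two stages above, pending confirmation, could be previewed for a single flag with a sketch like the following. &amp;lt;code&amp;gt;ae_chase_flag&amp;lt;/code&amp;gt; is used only as an example, and the &amp;lt;code&amp;gt;clean_value&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; names are hypothetical. The sketch assumes that both an empty string and a string of spaces should become &amp;#039;0&amp;#039;, and it leaves &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; because the lists do not cover it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: stage 1 mimics the clean schema conversions,&lt;br /&gt;
-- stage 2 applies the sokwe conversions to the stage 1 result.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae.ae_chase_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae.ae_chase_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN ae.ae_chase_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN ae.ae_chase_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae.ae_chase_flag&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Any row where &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; but &amp;lt;code&amp;gt;raw_value&amp;lt;/code&amp;gt; is not points at a value the lists do not yet cover.&lt;br /&gt;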
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined rows (but see the notes below) where ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison, so that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
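&lt;br /&gt;
A sketch of how the distinct AGGRESSION_EVENT rows behind that pivoted count might be counted, assuming the same exclusions as the queries below (NULL and empty-string participants are skipped, and whitespace is not trimmed); the alias event_rows is illustrative only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count each aggression_event row once, even when both participants are unmatched.&lt;br /&gt;
SELECT COUNT(*) AS event_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
   OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;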
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query trims white space from the aggression participant ids before comparing them with the biography animid values: the focus here is on finding events outside the defined date ranges rather than on matching ids, and the white space issues are assumed to be resolved separately (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#83) There are AGGRESSION_EVENT rows where the Actor and Actee have the same animal ID. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 363 AGGRESSION_EVENT rows where `ae_b_aggressor_id` and `ae_b_recipient_id` share the same animal id.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH potential_role_dupes AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      ae.ae_b_aggressor_id,&lt;br /&gt;
      ae.ae_b_recipient_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description,&lt;br /&gt;
      ae.ae_comments,&lt;br /&gt;
      ae.dup,&lt;br /&gt;
      ROW_NUMBER() OVER (&lt;br /&gt;
        ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time&lt;br /&gt;
      ) AS source_order&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    source_order,&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments,&lt;br /&gt;
    dup&lt;br /&gt;
FROM potential_role_dupes&lt;br /&gt;
WHERE ae_b_aggressor_id = ae_b_recipient_id&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=602</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=602"/>
		<updated>2026-05-20T22:00:17Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Clarifying note regarding problem #82.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each problem was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/1/2024.  Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
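&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is &amp;#039;UNK&amp;#039; and using the log columns that appear in the queries on this page; the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL made_by with the assumed &amp;#039;UNK&amp;#039; person code.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;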
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
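&lt;br /&gt;
A sketch of the combined cleanup for this problem and problem #31, showing each affected animid next to its cleaned form.  It is illustrative only and is not taken from the conversion program; the output column names are made up for the example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces, then force upper-case.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;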
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
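&lt;br /&gt;
The rule described above can be sketched as a query.  This is a hedged illustration, not the conversion program&amp;#039;s code: the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period having at least one observer.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed, not taken from PERIODS.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;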
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
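&lt;br /&gt;
A minimal sketch of a query that could list the affected follows, assuming the same clean.follow observer columns used in problem #37.  It has not been run against the database, so no row count is given:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where, in the AM or the PM period,&lt;br /&gt;
-- exactly one of the two observer columns is blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;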
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the duplicates, is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
&lt;br /&gt;
Need to investigate further (Brec Swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension prove to be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
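&lt;br /&gt;
A rough sketch of this recoding, written as a single PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;.  It assumes the change is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code; the statement actually used by the conversion may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;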
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
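&lt;br /&gt;
The two case corrections above could be sketched as follows.  This assumes they are applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;, which is not confirmed here; adding the new codes to the list of valid values is a separate step that is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;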
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
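&lt;br /&gt;
The start-time-only check mentioned in the problem description has no query listed.  The following is a sketch of it, adapted from the preceding query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;; whether this is exactly the query that produced the count of 430 is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;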
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
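&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is applied in place to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion may instead trim the values while loading the table):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;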
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
Note: An earlier version of the query spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
&lt;br /&gt;
  ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
  Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
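&lt;br /&gt;
A minimal sketch of this change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion code may make the change elsewhere):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the disallowed 0 values with NULL.&lt;br /&gt;
-- No other validation of the affected rows is done.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;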
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single &amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt; entry to &amp;lt;code&amp;gt;EVL&amp;lt;/code&amp;gt; in Access on 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
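&lt;br /&gt;
A minimal sketch of such a clean-up, reusing the condition from the query above. This is an illustration, not necessarily the code in the commit referenced above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
-- The WHERE clause limits the update to the rows that need it.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;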
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
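&lt;br /&gt;
The mapping can be previewed before it is applied. The following is a sketch only: the &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;row_count&amp;lt;/code&amp;gt; column aliases are illustrative, and any value not covered by the rule above yields &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;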
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
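&lt;br /&gt;
Pending that confirmation, the proposed conversions can be previewed with a query such as the following. It is a sketch only: the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; alias is illustrative, and values not covered by the list above (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) are deliberately left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that they stand out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value next to its proposed conversion.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS converted_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;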
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
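&lt;br /&gt;
The inflation described in the first note can be avoided by counting each AGGRESSION_EVENT row at most once. The following is a sketch only, assuming the same matching rules as the queries in this section (exact match, with NULL and empty-string participants ignored); the &amp;lt;code&amp;gt;event_rows&amp;lt;/code&amp;gt; alias is illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count events where the Actor and/or Actee is missing&lt;br /&gt;
-- from biography, counting each event once even if both are missing.&lt;br /&gt;
SELECT COUNT(*) AS event_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM (&lt;br /&gt;
    VALUES&lt;br /&gt;
      (ae.ae_b_aggressor_id),&lt;br /&gt;
      (ae.ae_b_recipient_id)&lt;br /&gt;
  ) AS v(participant)&lt;br /&gt;
  WHERE v.participant IS NOT NULL&lt;br /&gt;
    AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = v.participant&lt;br /&gt;
    )&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;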
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query trims white space from the aggression participant ids before comparing them with the biography animids. The focus here is on events outside of the defined date ranges rather than on matching ids, and the query assumes that issues concerning white space will be resolved (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value [(&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)]. Note that all non-compliant values are NULL or some variation of white space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=601</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=601"/>
		<updated>2026-05-20T21:52:55Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Refactor problem #82 to highlight the issue of white spaces.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
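&lt;br /&gt;
As a minimal sketch of one way such a value can be repaired, assuming the fix amounts to keeping only the time-of-day part: casting to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; discards the date. The literal value and the output column name below are hypothetical, and the actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical value whose date part is not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
-- The cast keeps only the time of day.&lt;br /&gt;
SELECT (TIMESTAMP &amp;#039;1900-01-05 07:15:00&amp;#039;)::TIME AS time_only;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;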
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
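&lt;br /&gt;
A minimal sketch of the substitution described above, assuming it amounts to replacing a NULL &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; code. The inline sample rows, the &amp;lt;code&amp;gt;sample_log&amp;lt;/code&amp;gt; name, and the IDs in it are hypothetical; the actual change is the one in the commit cited above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical rows standing in for the update log.&lt;br /&gt;
WITH sample_log (chimp_id, made_by) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;P1&amp;#039;),&lt;br /&gt;
         (&amp;#039;BBB&amp;#039;, NULL)&lt;br /&gt;
)&lt;br /&gt;
SELECT chimp_id,&lt;br /&gt;
       -- A NULL made_by becomes the unknown person.&lt;br /&gt;
       COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;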
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
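&lt;br /&gt;
The cleanup for this problem and for problem #31 can be pictured as a single expression.  This is a minimal sketch, not the conversion code itself; the sample animids are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids, not real data&lt;br /&gt;
WITH sample_follow (fol_b_animid) AS&lt;br /&gt;
  (VALUES (&amp;#039;AB  &amp;#039;), (&amp;#039;cd&amp;#039;), (&amp;#039;EF&amp;#039;))&lt;br /&gt;
-- Remove trailing spaces (#31), then force upper-case (#32)&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
     , UPPER(RTRIM(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;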
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
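&lt;br /&gt;
The mapping described above can be sketched as follows.  This is an illustration under assumptions, not the conversion code: the sample row is made up, the output column names are lower-cased guesses at the FOLLOW_OBSERVERS columns, and the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample row standing in for clean.follow&lt;br /&gt;
WITH sample_follow (fol_date, fol_b_animid&lt;br /&gt;
                  , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) AS&lt;br /&gt;
  (VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AB&amp;#039;&lt;br /&gt;
         , &amp;#039;xx&amp;#039;, &amp;#039;  &amp;#039;&lt;br /&gt;
         , CAST(NULL AS TEXT), CAST(NULL AS TEXT)))&lt;br /&gt;
-- One row per period, only when at least one observer is present&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period  -- hypothetical period code&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM sample_follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; AS period  -- hypothetical period code&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM sample_follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this sample row only the AM row is produced, because both PM observer columns are NULL.&lt;br /&gt;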
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
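&lt;br /&gt;
A minimal sketch of the substitution, assuming it is done with an expression over the trimmed observer value.  The sample values are made up, and this is not necessarily how the conversion implements it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample observer values, not real data&lt;br /&gt;
WITH sample_obs (observer) AS&lt;br /&gt;
  (VALUES (&amp;#039;xx&amp;#039;), (&amp;#039;   &amp;#039;), (CAST(NULL AS TEXT)))&lt;br /&gt;
-- Blank or NULL observers become the NONE person&lt;br /&gt;
SELECT observer AS original&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(observer), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS converted&lt;br /&gt;
  FROM sample_obs;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;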
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
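&lt;br /&gt;
The rule can be sketched as below.  The sample codes are made up, and the &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; column name is hypothetical; the real column in PEOPLE may be named differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample people codes, not real data&lt;br /&gt;
WITH sample_people (person) AS&lt;br /&gt;
  (VALUES (&amp;#039;xx&amp;#039;), (&amp;#039;xx/yy&amp;#039;))&lt;br /&gt;
-- A person is active only when the code has no &amp;quot;/&amp;quot; character&lt;br /&gt;
SELECT person&lt;br /&gt;
     , STRPOS(person, &amp;#039;/&amp;#039;) = 0 AS active  -- hypothetical column name&lt;br /&gt;
  FROM sample_people;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;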
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
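&lt;br /&gt;
A minimal sketch of the substitution, using made-up sample values.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is applied with a simple expression; the conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample cycle codes, not real data&lt;br /&gt;
WITH sample_arrival (fa_type_of_cycle) AS&lt;br /&gt;
  (VALUES (&amp;#039;n/a&amp;#039;), (&amp;#039;U&amp;#039;), (CAST(NULL AS TEXT)))&lt;br /&gt;
-- A NULL cycle code becomes MISS, other codes are unchanged&lt;br /&gt;
SELECT fa_type_of_cycle AS original&lt;br /&gt;
     , COALESCE(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) AS converted&lt;br /&gt;
  FROM sample_arrival;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;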
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
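&lt;br /&gt;
A minimal sketch of the recoding described above, assuming it is applied with a single PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and reusing the conditions of the query under &amp;quot;Bad data&amp;quot; (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;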
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
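&lt;br /&gt;
The two case corrections above could be sketched as one statement. This assumes the fix is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, where the text above says the cleaning happens. It does not cover adding the new codes, which is a separate step.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the misspelled codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source =&lt;br /&gt;
        case&lt;br /&gt;
          when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        end&lt;br /&gt;
  -- The where clause limits the update to the rows the case expression handles.&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;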
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
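&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not listed. A sketch of what it presumably looks like, adapted from the query above (a join is used in place of &amp;lt;code&amp;gt;exists&amp;lt;/code&amp;gt;, which returns the same rows because each key combination appears once in &amp;lt;code&amp;gt;dups&amp;lt;/code&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;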
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
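&lt;br /&gt;
A minimal sketch of the trimming, assuming the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the column is a type, such as text, for which trailing spaces are significant in comparisons:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces, touching only the affected rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;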
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE (from running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, though adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
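&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement amounts to:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the disallowed 0 with NULL; no other validation is done.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;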
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
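&lt;br /&gt;
A sketch of the second half of that fix, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that insert is left out because the table&amp;#039;s columns are not listed here); the referenced commit may implement it differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a &amp;#039;none&amp;#039; value already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;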
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
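&lt;br /&gt;
One way to preview that mapping before applying it is sketched below. The &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column alias is hypothetical, and any flag value not named above is deliberately left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show how each distinct flag value would convert to a boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- hypothetical name&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;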
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
ORDER BY&lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than `Y` or `N`, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
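The mapping proposed above could be previewed before it is applied. The following is a minimal sketch, not the confirmed conversion. It assumes that flags are trimmed and upper-cased before matching (as in the queries above) and that any value not in the list, including NULL, is left unmapped for review:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only: does not modify any data.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- Assumption: blank, N and NO all mean not certain.&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- Anything else (including NULL) stays NULL so it can be reviewed.&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS proposed_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;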
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT rows where ae_b_recipient_id is NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
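&lt;br /&gt;
The two conversion steps above can be previewed for a single flag. The following is a minimal sketch, not the confirmed conversion. It uses ae_chase_flag as an arbitrary example, assumes that values are trimmed and upper-cased before matching, and assumes that NULL is treated like a blank in the second step:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only: does not modify any data.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_chase_flag AS raw_value,&lt;br /&gt;
    -- Step 1: clean schema conversions.&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_chase_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_chase_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_chase_flag&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  -- Step 2: sokwe conversions; unmapped values stay NULL for review.&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN COALESCE(BTRIM(clean_value), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;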
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of affected AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
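&lt;br /&gt;
To get the number of affected AGGRESSION_EVENT rows rather than the pivoted count described in the first note, each event can be counted once. The following is a minimal sketch under the same assumptions as the queries below (NULL and empty-string ids are skipped, and whitespace is significant):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Counts each event once, even when both ids are missing from biography.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;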
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips whitespace from the aggression participant ids before comparing them with biography animids: the focus here is on events outside the defined date ranges rather than on matching ids, and the query assumes that the whitespace issues will be resolved (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;). Note that most (maybe all) non-compliant values are NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Fine-grain assessment of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
Note the &amp;#039;&amp;#039;white space&amp;#039;&amp;#039; problem.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct &amp;#039;&amp;quot;&amp;#039; || ae_fight_category || &amp;#039;&amp;quot;&amp;#039; from clean.aggression_event ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=600</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=600"/>
		<updated>2026-05-20T21:42:06Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #82 there are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
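&lt;br /&gt;
A minimal sketch of one way such a fix could work, assuming the conversion keeps only the time portion of the value.  This is an illustration, not the actual conversion code, and the two sample timestamps are hypothetical rather than rows from BRECORD_NOTES:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; casting to TIME discards the date portion.&lt;br /&gt;
select sample.ts, sample.ts::TIME as time_only&lt;br /&gt;
  from (values (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
             , (TIMESTAMP &amp;#039;1974-03-02 07:15:00&amp;#039;)) as sample (ts);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;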
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
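&lt;br /&gt;
The substitution can be sketched with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  This is an illustration only, not the code from the commit; the sample person code ABC is hypothetical, and UNK is the unknown person described above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; a NULL made_by becomes the UNK person.&lt;br /&gt;
select sample.made_by&lt;br /&gt;
     , coalesce(sample.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from (values (&amp;#039;ABC&amp;#039;), (NULL)) as sample (made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;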
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; column is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; column is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
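&lt;br /&gt;
With follow_arrival as the source of truth, both nest flags can be derived from fa_type_of_nesting alone.  A minimal sketch, assuming the code meanings implied by the queries for problems #26 and #27 (1 or 3: the focal began in a nest; 2 or 3: the focal ended in a nest).  The sample codes and the output column names begins_in_nest and ends_in_nest are illustrative, not part of the database design:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample codes; 0 stands in for any non-nesting value.&lt;br /&gt;
select sample.fa_type_of_nesting&lt;br /&gt;
     , sample.fa_type_of_nesting in (1, 3) as begins_in_nest&lt;br /&gt;
     , sample.fa_type_of_nesting in (2, 3) as ends_in_nest&lt;br /&gt;
  from (values (0), (1), (2), (3)) as sample (fa_type_of_nesting);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;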
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
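&lt;br /&gt;
As a rough sketch (not the actual conversion code, which is not shown on this page), the combined cleanup for problems #31 and #32 can be previewed with the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt; column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the animids that the cleanup would change&lt;br /&gt;
select fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;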
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
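&lt;br /&gt;
A minimal sketch of how the substitution might look for the AM period, assuming the &amp;quot;NONE&amp;quot; person is identified by the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;obs_tiki&amp;lt;/code&amp;gt; aliases are illustrative only, and the real conversion code may be organized differently.  Following the rule described in problem #36, a period produces a row only when at least one of its two observer columns has a value.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: AM observers, with NONE filling in for a missing value&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , fol_date&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The PM period would use the &amp;lt;code&amp;gt;fol_pm_observer_1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fol_pm_observer_2&amp;lt;/code&amp;gt; columns in the same way.&lt;br /&gt;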
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
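&lt;br /&gt;
One way the change could be expressed, as a sketch only: it assumes &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an ordinary updatable table and that a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; cycle code is acceptable there.  The actual cleanup code is not shown on this page and may work differently, for example during the load into the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;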
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
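&lt;br /&gt;
The two case corrections above can be previewed with a query along these lines.  This is a sketch, not the conversion code itself, and the &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt; aliases are made up for illustration.  All other codes pass through unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each fa_data_source value next to its corrected form&lt;br /&gt;
select fa_data_source as original&lt;br /&gt;
     , case&lt;br /&gt;
         when lower(fa_data_source) = &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as cleaned&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by 1, 2&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;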
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; for duplicates yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
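&lt;br /&gt;
A minimal sketch of that cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema) and that trimming does not collide with an existing row:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; the schema name is an assumption.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;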
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything else about the affected rows.&lt;br /&gt;
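&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema table used by the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; replaces every 0 animal ID number with NULL,&lt;br /&gt;
-- without checking anything else about the row.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;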
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
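&lt;br /&gt;
The rule above could be written as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is only a sketch: it runs against a hypothetical inline list of sample values rather than the real table, and the column alias &amp;lt;code&amp;gt;flag_as_boolean&amp;lt;/code&amp;gt; is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; sample values stand in for ae_bad_observation_flag.&lt;br /&gt;
-- Any value not covered by the rule is left NULL so that it stands out.&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN raw_flag IS NULL OR btrim(raw_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN raw_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS flag_as_boolean&lt;br /&gt;
  FROM (VALUES (&amp;#039;X&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039; &amp;#039;), (NULL)) AS sample (raw_flag);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;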
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
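&lt;br /&gt;
The proposed conversions could be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is not a confirmed rule: the sample values are a hypothetical inline list, the alias &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is made up, and trimming and upper-casing before comparison is an assumption borrowed from the queries above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; pending confirmation of the conversion list.&lt;br /&gt;
-- Values not in the list (including NULL) are left NULL.&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN btrim(raw_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN upper(btrim(raw_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN upper(btrim(raw_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN upper(btrim(raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
  FROM (VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;)) AS sample (raw_flag);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;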
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
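&lt;br /&gt;
A minimal sketch for counting each offending AGGRESSION_EVENT row once, rather than once per role. It assumes the same exclusions (NULL and empty-string ids are skipped) and the same whitespace-sensitive comparison as the queries below. The alias event_count is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS event_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    -- Actor is present but has no exact match in biography&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    -- Actee is present but has no exact match in biography&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A row with both an unmatched Actor and an unmatched Actee contributes 1 to this count but 2 to the pivoted count.&lt;br /&gt;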
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips whitespace when comparing aggression participant ids with biography animid values: the focus here is on events outside the defined date ranges rather than on matching ids, and whitespace issues are assumed to be resolved separately (also in the migration work-around). The query further restricts the check to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#82) There are AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 2,041 AGGRESSION_EVENT rows where severity (ae_fight_category) is not an allowable value (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;). Note that most (maybe all) non-compliant values are NULL.&lt;br /&gt;
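&lt;br /&gt;
Because the queries below filter on ae_fight_category IS NOT NULL, NULL values are not listed by them. A minimal sketch for splitting the total into NULL and other non-compliant values, assuming the allowable set given above; the column aliases are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    -- Rows with no severity recorded at all&lt;br /&gt;
    COUNT(*) FILTER (WHERE ae_fight_category IS NULL) AS null_count,&lt;br /&gt;
    -- Rows with a value (including the empty string) outside the allowable set&lt;br /&gt;
    COUNT(*) FILTER (&lt;br /&gt;
      WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
        AND BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
    ) AS other_noncompliant_count&lt;br /&gt;
FROM clean.aggression_event;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;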
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_fight_category,&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS normalized_fight_category,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    ae_comments&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;)) AS offending_value,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fight_category IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       BTRIM(ae_fight_category) = &amp;#039;&amp;#039;&lt;br /&gt;
       OR BTRIM(ae_fight_category) NOT IN (&amp;#039;unrated&amp;#039;, &amp;#039;0&amp;#039;, &amp;#039;1&amp;#039;, &amp;#039;2&amp;#039;, &amp;#039;?&amp;#039;)&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_fight_category, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, offending_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=599</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=599"/>
		<updated>2026-05-20T21:13:30Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Refactor problem #81 to consider the lower and upper bounds of the allowable event window.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
Codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
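&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; as described above.  It is illustrative only and is not the conversion program&amp;#039;s actual code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: replace a NULL made_by with the unknown person.&lt;br /&gt;
select log.date_of_update&lt;br /&gt;
     , log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;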
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
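&lt;br /&gt;
A sketch of how the begin-in-nest value could be derived from follow_arrival alone, using the same nesting codes (1 and 3) that the query above treats as starting in the nest.  The &amp;lt;code&amp;gt;begins_in_nest&amp;lt;/code&amp;gt; column name is hypothetical, and restricting the join to the focal&amp;#039;s own arrival row is an assumption, not something taken from the conversion program:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     -- NULL when fa_type_of_nesting is NULL&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3)) AS begins_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Assumption: use only the focal&amp;#039;s own arrival row&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;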
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
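&lt;br /&gt;
Combined with the trailing-space removal of problem #31, the cleanup can be sketched as a single expression.  The query below is illustrative, not the conversion program&amp;#039;s actual code, and the output column names are hypothetical.  It lists each raw animid in the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema that the cleanup would change, next to its cleaned form:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: strip trailing spaces, then force upper-case.&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;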
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
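&lt;br /&gt;
A sketch of the substitution for the AM period, assuming the person code is &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; as described above.  The output column names are hypothetical and the query is not the conversion program&amp;#039;s actual code.  It selects the follows with at least one AM observer and fills in whichever observer is missing:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: NULL or blank observers become NONE.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;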
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
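&lt;br /&gt;
As a rough sketch of the rule above, the following query picks one spelling for each group of names that differ only by case.  The sample names are hypothetical.  Two things are assumptions, not part of the recorded solution: that &amp;quot;mixed case&amp;quot; means a spelling that is neither all upper case nor all lower case, and that the fallback, when no such spelling exists, is the first spelling in sort order.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample names; substitute the uniq_people CTE shown above.&lt;br /&gt;
WITH uniq_people (person) AS (&lt;br /&gt;
  VALUES (&amp;#039;ABC&amp;#039;), (&amp;#039;Abc&amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;XY&amp;#039;), (&amp;#039;xy&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT LOWER(uniq_people.person) AS folded&lt;br /&gt;
     , COALESCE(&lt;br /&gt;
         -- Prefer a spelling that is neither all upper nor all lower case&lt;br /&gt;
         MIN(uniq_people.person)&lt;br /&gt;
           FILTER (WHERE uniq_people.person &amp;lt;&amp;gt; UPPER(uniq_people.person)&lt;br /&gt;
                         AND uniq_people.person &amp;lt;&amp;gt; LOWER(uniq_people.person))&lt;br /&gt;
         -- Assumed fallback: the first spelling in sort order&lt;br /&gt;
       , MIN(uniq_people.person)&lt;br /&gt;
       ) AS preferred&lt;br /&gt;
  FROM uniq_people&lt;br /&gt;
  GROUP BY LOWER(uniq_people.person)&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY LOWER(uniq_people.person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;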
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
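&lt;br /&gt;
A minimal sketch of the second half of this solution.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;, and that the change is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; where the change is actually made is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Give arrivals with no cycle information the MISS code&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;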
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres.  This makes the db a temporal database, which tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
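&lt;br /&gt;
The queries above test &amp;lt;code&amp;gt;b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&amp;lt;/code&amp;gt;, which does not match rows where &amp;lt;code&amp;gt;b_sex&amp;lt;/code&amp;gt; is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.  Whether unknown sex is stored as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; or as a code is not stated here.  If &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values are possible, a check along these lines, run against a fresh dump, would include them.  A count of zero would suggest the fix is complete.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- IS DISTINCT FROM treats a NULL b_sex as &amp;quot;not F&amp;quot;&lt;br /&gt;
select count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex IS DISTINCT FROM &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;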
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
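&lt;br /&gt;
A minimal sketch of the change, assuming it is applied with a single statement in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an allowed cycle code.  The statement actually used is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Females cannot have a cycle code of n/a; record the code as missing&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;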
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
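&lt;br /&gt;
The two case corrections above could be sketched as one statement.  This assumes the correction is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the statement actually used is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the character case of the miscased data source codes&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = CASE fa_data_source&lt;br /&gt;
                         WHEN &amp;#039;BREC&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;brec&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;TikI&amp;#039; THEN &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       END&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;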
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row:  What does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
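&lt;br /&gt;
The third check described in the problem statement, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, has no query recorded.  The following is a sketch of what it presumably looks like, modeled on the query above.  It is not necessarily the query that produced the 430 row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;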
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
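&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with a single statement.  The statement actually used is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal id, touching only the affected rows&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;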
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
The query above originally spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;, ending in a lower case &amp;quot;d&amp;quot;, and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
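&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion itself may apply it at a different step):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace the invalid 0 animal ID numbers with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined or changed.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the Bad Data query afterwards is a simple way to check that no &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values remain.&lt;br /&gt;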
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single &amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt; entry to &amp;lt;code&amp;gt;EVL&amp;lt;/code&amp;gt; in Access on 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
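&lt;br /&gt;
The removal could look like the following sketch. It assumes the column has the same name in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; as in &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt;; the actual change made in 9c114ac is not reproduced here and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Strip trailing spaces from the arriving animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;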
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
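&lt;br /&gt;
Assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row already exists in ARRIVAL_SOURCES (the columns of that table are not shown on this page, so creating the row is omitted), the replacement might be sketched as follows. The change actually made in 8c05232 may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Use the &amp;#039;none&amp;#039; source in place of a missing data source.&lt;br /&gt;
-- Assumes &amp;#039;none&amp;#039; is spelled exactly this way in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;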
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian fixed this in Access on 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
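&lt;br /&gt;
As an illustration only, this mapping can be previewed with a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression alongside the raw values. The &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; column name is made up for this sketch, and any value not covered by the rule shows up with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; result so it can be spotted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL and a single space both mean the flag is not set.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag = &amp;#039; &amp;#039; THEN FALSE&lt;br /&gt;
         -- X and YES both mean the flag is set.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- No ELSE: unexpected values yield NULL.&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;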
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
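&lt;br /&gt;
The proposed conversions, which are still awaiting confirmation, can be previewed against the raw values with a sketch like the one below. It assumes that &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; is meant as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern (any value starting with &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt;); the &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- No ELSE: values not in the list above (including NULL) yield NULL.&lt;br /&gt;
       END AS sokwe_value&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; are the raw values that the list does not yet cover.&lt;br /&gt;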
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
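&lt;br /&gt;
A sketch of the two proposed stages for a single flag (&amp;lt;code&amp;gt;ae_chase_flag&amp;lt;/code&amp;gt;, picked arbitrarily) follows. It assumes that &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; are &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; patterns, and that a value consisting only of spaces counts as the empty string in the second stage; neither assumption has been confirmed. The &amp;lt;code&amp;gt;clean_value&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;WITH cleaned AS (&lt;br /&gt;
  -- Stage 1: the proposed clean schema conversions.&lt;br /&gt;
  SELECT ae.ae_chase_flag AS raw_value&lt;br /&gt;
       , CASE&lt;br /&gt;
           WHEN ae.ae_chase_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           ELSE ae.ae_chase_flag&lt;br /&gt;
         END AS clean_value&lt;br /&gt;
    FROM clean.aggression_event AS ae&lt;br /&gt;
)&lt;br /&gt;
-- Stage 2: the proposed sokwe conversions.&lt;br /&gt;
SELECT raw_value&lt;br /&gt;
     , clean_value&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         -- No ELSE: anything else (including NULL) yields NULL.&lt;br /&gt;
       END AS sokwe_value&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM cleaned&lt;br /&gt;
  GROUP BY raw_value, clean_value&lt;br /&gt;
  ORDER BY row_count DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;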
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the &amp;lt;code&amp;gt;CROSS JOIN LATERAL&amp;lt;/code&amp;gt;, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
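&lt;br /&gt;
To get the number of AGGRESSION_EVENT rows affected, rather than the pivoted count described in the first note, each event can be counted once when either participant is missing from biography. This is a sketch using the same exclusions as the queries below (NULL and empty ids are skipped, and whitespace is not trimmed); the &amp;lt;code&amp;gt;event_count&amp;lt;/code&amp;gt; name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT count(*) AS event_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  -- The aggressor is present but not in biography...&lt;br /&gt;
  WHERE (ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
         AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
         AND NOT EXISTS (SELECT 1&lt;br /&gt;
                           FROM clean.biography AS b&lt;br /&gt;
                           WHERE b.b_animid = ae.ae_b_aggressor_id))&lt;br /&gt;
     -- ...or the recipient is present but not in biography.&lt;br /&gt;
     OR (ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
         AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
         AND NOT EXISTS (SELECT 1&lt;br /&gt;
                           FROM clean.biography AS b&lt;br /&gt;
                           WHERE b.b_animid = ae.ae_b_recipient_id));&amp;lt;/pre&amp;gt;&lt;br /&gt;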
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee are in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips whitespace from the aggression participant ids before comparing them with the biography animids: the focus here is on finding events outside the defined date ranges rather than on matching ids, and it is assumed that the whitespace issues will be resolved (also in the migration work-around). The query further restricts the aggression events to those for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 112 AGGRESSION_EVENT rows where ae_time is NULL or outside the allowable window (&amp;#039;04:00:00&amp;#039;, &amp;#039;20:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN ae_time IS NULL THEN &amp;#039;null_time&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time THEN &amp;#039;below_min&amp;#039;&lt;br /&gt;
      WHEN ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time THEN &amp;#039;above_max&amp;#039;&lt;br /&gt;
    END AS time_issue&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
   OR ae_time &amp;gt; &amp;#039;20:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=598</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=598"/>
		<updated>2026-05-20T21:01:55Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #81 there are AGGRESSION_EVENT rows where ae_time is NULL or less than the earliest allowable value (&amp;#039;04:00:00&amp;#039;).&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
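&lt;br /&gt;
As a rough sketch of one way such a fix could work, assuming the conversion simply keeps the time portion of the value and discards the date portion (this is an assumption, not a description of the actual conversion code):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: casting to TIME drops the date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;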
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
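&lt;br /&gt;
A minimal sketch of this substitution, assuming it can be expressed as a COALESCE over the existing made_by column; the literal &amp;#039;UNK&amp;#039; code is taken from the text above, and the actual conversion program may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration only, not the conversion code.&lt;br /&gt;
-- Report each log row with NULL made_by values replaced by UNK.&lt;br /&gt;
SELECT date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log&lt;br /&gt;
  ORDER BY date_of_update, chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;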
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
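&lt;br /&gt;
A minimal sketch of how the follow begin and end times could be taken from FOLLOW_ARRIVAL alone, ignoring the time columns on FOLLOW.  It reuses the focal-only aggregation of the query above.  The &amp;lt;code&amp;gt;derived_*&amp;lt;/code&amp;gt; column names are hypothetical, and this is not necessarily how the conversion itself does it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow, with begin/end times taken from&lt;br /&gt;
-- the focal&amp;#039;s own FOLLOW_ARRIVAL rows.&lt;br /&gt;
-- The derived_* names are made up for this example.&lt;br /&gt;
SELECT fa_fol_date AS fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid AS fol_b_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_time_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_time_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;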
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
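&lt;br /&gt;
As an illustration of defaulting to Follow_Arrival, the following sketch derives a begin-in-nest value from the first arrival row, using the same nesting codes (1 and 3) that the Bad Data query treats as starting in the nest.  Restricting the first arrival to the focal&amp;#039;s own row is an assumption made here, and &amp;lt;code&amp;gt;derived_begin_in_nest&amp;lt;/code&amp;gt; is a hypothetical name.  The end-in-nest case would be analogous, with MAX(fa_time_end) and codes 2 and 3.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: begin-in-nest as implied by FOLLOW_ARRIVAL.&lt;br /&gt;
-- Assumption: only the focal&amp;#039;s own arrival row is consulted.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
       -- NULL when fa_type_of_nesting is NULL&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3))&lt;br /&gt;
         AS derived_begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;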
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
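&lt;br /&gt;
A small sketch of the combined cleanup for problems #31 and #32: trim trailing spaces, then force upper-case.  It only previews the rows that would change; the &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized_animid&amp;lt;/code&amp;gt; names are hypothetical, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the animid normalization on the uncleaned data.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;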
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
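&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
No query is recorded for this problem.  The following is a sketch of one way to list the affected follows, assuming &amp;quot;only one observer&amp;quot; means that, within the AM or the PM period, exactly one of the two observer columns is NULL or empty after trimming.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where a period has exactly one observer.&lt;br /&gt;
-- Comparing the two &amp;quot;is empty&amp;quot; booleans with &amp;lt;&amp;gt; acts as an exclusive-or.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;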
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
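&lt;br /&gt;
A sketch of the substitution for the AM period, following the column mapping described in problem #36 (*_observer_1 to OBS_BRec, *_observer_2 to OBS_Tiki).  The output column names are hypothetical and the PM period would be handled the same way; this is an illustration, not the conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: AM observers, with &amp;quot;NONE&amp;quot; filling a missing observer.&lt;br /&gt;
-- Follows with no AM observers at all produce no row, per problem #36.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;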
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
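&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a simple &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt; when reading the arrivals.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; output name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report MISS where no cycle information was recorded.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;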
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Follow up with Karl.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
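&lt;br /&gt;
To make the age boundary concrete, here is a worked example of the cutoff arithmetic for a made-up follow date of 15 June 2000.  The expression evaluates to 15 June 1985, so a female born on or before that day (15 years old or more on the follow date) is flagged, while one born on 16 June 1985 is not.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the follow date is an arbitrary example value.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;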
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
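&lt;br /&gt;
A minimal sketch of that change, assuming it is applied as a single update against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and restricted to the female arrivals selected by the query above (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a as MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;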
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
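&lt;br /&gt;
The two case corrections could be sketched as follows, assuming they are applied as updates to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;. Adding the new codes to the list of valid data sources is not shown, because the structure of that list is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization variants listed above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;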
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
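&lt;br /&gt;
The third check described in the problem statement, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, has no query listed.  A sketch, assuming it simply mirrors the start and end time query with one column removed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As in the query above, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time never match, because the comparison uses plain equality.&lt;br /&gt;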
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
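&lt;br /&gt;
A minimal sketch of the cleanup, assuming the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no constraint blocks rewriting the focal id in place:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;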
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
With the column spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (lowercase final &amp;quot;d&amp;quot;), the query fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
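&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: no check is made of which individuals are affected.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;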
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
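&lt;br /&gt;
The follow_arrival half of this could be sketched as below.  It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been added to ARRIVAL_SOURCES; that step is not shown because the columns of that table are not listed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace missing data sources with the none code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;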
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
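&lt;br /&gt;
One way to preview this mapping before applying it is a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression over the raw values.  This is a sketch, not the conversion code; the column alias names are made up for illustration, and any value not covered by the rule is deliberately left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Anything else falls through to NULL (unhandled value).&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;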
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
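== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have a ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; as required.&lt;br /&gt;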
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
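&lt;br /&gt;
Pending confirmation, the proposed mapping could be previewed with a query along these lines. This is only a sketch: it assumes values are trimmed and upper-cased before comparison (as in the queries above), it leaves NULL and any unlisted value as NULL so that they stand out for review, and the proposed_flag column name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the proposed sokwe mapping; nothing is modified.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_recipient_certainty_flag AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- A blank value trims to the empty string, so &amp;#039; &amp;#039; is covered by &amp;#039;&amp;#039;.&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    -- Assumption: NULL and unlisted values are left unmapped for review.&lt;br /&gt;
    ELSE NULL&lt;br /&gt;
  END AS proposed_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Any row where proposed_flag comes back NULL holds a value that the list above does not cover.&lt;br /&gt;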
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
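&lt;br /&gt;
Pending confirmation, the two-step mapping could be previewed for a single flag as sketched here, using ae_decided_flag as the example. The sketch assumes trimming and upper-casing before comparison (as in the queries above) and, as a further assumption not stated in the lists, treats NULL like a blank value; the clean_value and sokwe_value column names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview both conversion steps for one flag; nothing is modified.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae_decided_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae_decided_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae_decided_flag  -- unlisted values pass through unchanged&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- Assumption: NULL and blank both mean the flag is not set.&lt;br /&gt;
    WHEN BTRIM(COALESCE(clean_value, &amp;#039;&amp;#039;)) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL  -- value not covered by the lists; left for review&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY 1, 2, 3&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The same pattern would apply to each of the other flag columns.&lt;br /&gt;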
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if ae_b_recipient_id = NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. Note that the query strips white space from the aggression participant ids before comparing them with the biography animid values: the focus here is on checking for events outside of defined date ranges rather than on matching ids, on the assumption that issues concerning white space will be resolved (also in the migration work-around). The query further winnows the results to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#81) There are AGGRESSION_EVENT rows where ae_time is NULL or less than the earliest allowable value (&amp;#039;04:00:00&amp;#039;). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 108 AGGRESSION_EVENT rows where ae_time is NULL or less than the earliest allowable value (&amp;#039;04:00:00&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_date,&lt;br /&gt;
    ae_time,&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    ae_b_aggressor_id,&lt;br /&gt;
    ae_b_recipient_id,&lt;br /&gt;
    ae_source,&lt;br /&gt;
    ae_full_description&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_time IS NULL&lt;br /&gt;
   OR ae_time &amp;lt; &amp;#039;04:00:00&amp;#039;::time&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=597</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=597"/>
		<updated>2026-05-20T20:26:51Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #80 there are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
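&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal &amp;#039;UNK&amp;#039;.  The &amp;lt;code&amp;gt;sample_log&amp;lt;/code&amp;gt; rows and their values are made up for illustration; this is not the conversion program&amp;#039;s actual code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the real update log.&lt;br /&gt;
WITH sample_log (chimp_id, made_by) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;X1&amp;#039;),&lt;br /&gt;
         (&amp;#039;BBB&amp;#039;, NULL::text))&lt;br /&gt;
-- A NULL made_by becomes the unknown person; other values pass through.&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;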
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
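&lt;br /&gt;
A small sketch of this cleanup combined with the trailing-space removal from problem #31, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The &amp;lt;code&amp;gt;sample_follow&amp;lt;/code&amp;gt; rows and the animid &amp;#039;FD&amp;#039; are made up for illustration, and the conversion program itself may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows: trailing spaces, lower-case, already clean.&lt;br /&gt;
WITH sample_follow (fol_b_animid) AS (&lt;br /&gt;
  VALUES (&amp;#039;FD  &amp;#039;), (&amp;#039;fd&amp;#039;), (&amp;#039;FD&amp;#039;))&lt;br /&gt;
-- Strip trailing spaces first, then force upper-case.&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;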
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
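&lt;br /&gt;
The row-creation rule described above can be sketched roughly as follows.  The &amp;lt;code&amp;gt;sample_follow&amp;lt;/code&amp;gt; row, the observer codes, the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes, and the output column names are assumptions made for illustration; the real conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One hypothetical follow: two AM observers, no PM observers.&lt;br /&gt;
WITH sample_follow (fol_date, fol_b_animid,&lt;br /&gt;
                    fol_am_observer_1, fol_am_observer_2,&lt;br /&gt;
                    fol_pm_observer_1, fol_pm_observer_2) AS (&lt;br /&gt;
  VALUES (&amp;#039;2000-01-01&amp;#039;::date, &amp;#039;AAA&amp;#039;,&lt;br /&gt;
          &amp;#039;X1&amp;#039;, &amp;#039;X2&amp;#039;, &amp;#039; &amp;#039;, NULL::text)),&lt;br /&gt;
-- One candidate row per follow per time period.&lt;br /&gt;
periods AS (&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
       , btrim(fol_am_observer_1) AS obs_brec&lt;br /&gt;
       , btrim(fol_am_observer_2) AS obs_tiki&lt;br /&gt;
    FROM sample_follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
       , btrim(fol_pm_observer_1)&lt;br /&gt;
       , btrim(fol_pm_observer_2)&lt;br /&gt;
    FROM sample_follow)&lt;br /&gt;
-- Keep a period only when at least one observer is present.&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM periods&lt;br /&gt;
  WHERE coalesce(obs_brec, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR coalesce(obs_tiki, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this sample row only the AM period survives, because both PM observer columns are blank or NULL.&lt;br /&gt;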
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
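&lt;br /&gt;
A sketch of a query that should list the affected follows, assuming a blank observer is recorded the same way as in problem #37 (NULL or the empty string, after space trimming).  It selects follows where exactly one observer of the AM pair, or exactly one of the PM pair, is blank:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: a blank observer is NULL or empty after trimming.&lt;br /&gt;
-- Each parenthesized test is a boolean; comparing the two with &amp;lt;&amp;gt;&lt;br /&gt;
-- is true when exactly one observer of the pair is blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;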
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
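&lt;br /&gt;
The query above only returns a name when a case variant of that name also exists, because of the self-join on &amp;lt;code&amp;gt;LOWER(person)&amp;lt;/code&amp;gt;.  A simpler sketch, which lists every distinct trimmed observer value containing a &amp;quot;/&amp;quot; regardless of case variants, might look like the following.  (The &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; alias and the &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; column name are made up for illustration.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Collect all four observer columns, then keep values containing &amp;#039;/&amp;#039;.&lt;br /&gt;
-- NULL observers drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT observers.person, COUNT(*) AS uses&lt;br /&gt;
  FROM (SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
       ) AS observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY observers.person&lt;br /&gt;
  ORDER BY observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;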
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ from another name only by character case.  So, about half that many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot-checked rows have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the db a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is an indication that future cleanup is required.&lt;br /&gt;
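&lt;br /&gt;
A minimal sketch of how the change could be made, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table and the column accepts the value &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.  This is not necessarily the statement the conversion actually uses:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Recode n/a to MISS, but only for arriving individuals that are female.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;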
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
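&lt;br /&gt;
The two case corrections above could be sketched as the following updates, assuming they are applied to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the schema that the Bad data section describes as already cleaned):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the case variants of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the case variant of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;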
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
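&lt;br /&gt;
The start-time-only check described in the problem can be sketched the same way, by leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the grouping.  This is an illustration, and not necessarily the query that produced the count of 430:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;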
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access, Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
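&lt;br /&gt;
A minimal sketch of such an update, assuming the column is a variable-length text type, as the comparison in the query above implies:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Trim only the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;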
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; fails, because the column is named &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (seemingly those for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
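&lt;br /&gt;
As a sketch, assuming the change is applied directly to the biography table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement could look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: turn every 0 animal ID number into NULL.&lt;br /&gt;
-- No check is made on which individuals are affected.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;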
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
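&lt;br /&gt;
One way to express this rule is a CASE expression. This is only a sketch: the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, matching is assumed to ignore case and surrounding whitespace, and any other value is left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stays visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: map the text flag onto a boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL and all-space (or empty) values count as FALSE.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         -- X and YES count as TRUE.&lt;br /&gt;
         WHEN upper(btrim(ae.ae_bad_observation_flag)) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Anything else falls through to NULL.&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
  FROM easy.aggression_event AS ae;&amp;lt;/pre&amp;gt;&lt;br /&gt;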
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
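&lt;br /&gt;
If the conversions listed above are confirmed, they could be written as a CASE expression. This is only a sketch: the output column name &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is made up for illustration, the comparisons are case-sensitive, and any value not in the list (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) is left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: apply the proposed flag conversions.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- &amp;#039;Y%&amp;#039; matches Y, YES, and any other value starting with Y.&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
  FROM clean.aggression_event AS ae;&amp;lt;/pre&amp;gt;&lt;br /&gt;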
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
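** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;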
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as &amp;lt;code&amp;gt;group&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;males&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;females&amp;lt;/code&amp;gt;, etc.?&lt;br /&gt;
* Whitespace is significant in the comparison: for example, &amp;lt;code&amp;gt;VIN &amp;lt;/code&amp;gt; (with a trailing space) does not match &amp;lt;code&amp;gt;VIN&amp;lt;/code&amp;gt; in biography.&lt;br /&gt;
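&lt;br /&gt;
To count each affected AGGRESSION_EVENT row once, rather than once per role, the lateral pivot can be replaced by two tests joined with OR. This is a sketch that keeps the same exclusions as the queries below (NULL and empty-string ids are ignored); the output column name &amp;lt;code&amp;gt;event_rows&amp;lt;/code&amp;gt; is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: count events where the aggressor and/or the&lt;br /&gt;
-- recipient has no exact match in biography.&lt;br /&gt;
SELECT count(*) AS event_rows&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  WHERE (ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
         AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
         AND NOT EXISTS (SELECT 1&lt;br /&gt;
                           FROM clean.biography AS b&lt;br /&gt;
                           WHERE b.b_animid = ae.ae_b_aggressor_id))&lt;br /&gt;
     OR (ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
         AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
         AND NOT EXISTS (SELECT 1&lt;br /&gt;
                           FROM clean.biography AS b&lt;br /&gt;
                           WHERE b.b_animid = ae.ae_b_recipient_id));&amp;lt;/pre&amp;gt;&lt;br /&gt;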
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
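&lt;br /&gt;
The queries for problems #77 and #78 compare against the empty string directly, while the query for problem #80 trims white space first. As a sketch only, assuming the three columns are textual, the following applies the trimmed comparison to all three so that the counts are comparable. The output column names are illustrative, and the results may differ from the counts reported on this page if any values consist only of white space:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- NULLIF turns a trimmed empty string into NULL, so one test covers both cases.&lt;br /&gt;
SELECT&lt;br /&gt;
    COUNT(*) FILTER (WHERE NULLIF(BTRIM(ae_recipient_behavior), &amp;#039;&amp;#039;) IS NULL) AS missing_recipient_behavior,&lt;br /&gt;
    COUNT(*) FILTER (WHERE NULLIF(BTRIM(ae_full_description), &amp;#039;&amp;#039;) IS NULL) AS missing_full_description,&lt;br /&gt;
    COUNT(*) FILTER (WHERE NULLIF(BTRIM(ae_aggressor_behavior), &amp;#039;&amp;#039;) IS NULL) AS missing_aggressor_behavior&lt;br /&gt;
FROM clean.aggression_event;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;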
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee is in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data. The query strips white space when comparing aggression participant ids with biography animids, because the focus here is on events outside the defined date ranges rather than on matching ids; it assumes that the white space issues will be resolved (also in the migration work-around). The query further restricts the results to aggression events for which there is a valid follow record (not in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#80) There are AGGRESSION_EVENT rows where ae_aggressor_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 143 records where ae_aggressor_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_aggressor_behavior IS NULL&lt;br /&gt;
   OR BTRIM(ae_aggressor_behavior) = &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY ae_date, ae_fol_b_focal_id, ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=596</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=596"/>
		<updated>2026-05-20T20:06:03Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Updates problem #79 to detail how white space and valid follows are treated.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
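&lt;br /&gt;
The join above requires first.cm_start_date to be strictly earlier than second.cm_start_date, so two rows for the same individual that start on the same day would not be reported. A minimal sketch of a complementary check, assuming no key on clean.community_membership already rules such rows out (the output column rows_on_same_start is illustrative):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- List individuals with more than one membership row starting on the same date.&lt;br /&gt;
SELECT cm_b_animid&lt;br /&gt;
     , cm_start_date&lt;br /&gt;
     , count(*) AS rows_on_same_start&lt;br /&gt;
  FROM clean.community_membership&lt;br /&gt;
  GROUP BY cm_b_animid, cm_start_date&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY cm_b_animid, cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;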
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
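&lt;br /&gt;
As an illustration only (the conversion code from the commit is not shown on this page), the effect of the substitution could be previewed with COALESCE. This sketch assumes that made_by is a textual column and that the unknown person&amp;#039;s code is UNK, as named in the solution text; the output column names are illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show NULL made_by values as the unknown person, with a row count per person.&lt;br /&gt;
SELECT COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by_or_unk&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.community_membership_update_log&lt;br /&gt;
  GROUP BY COALESCE(made_by, &amp;#039;UNK&amp;#039;)&lt;br /&gt;
  ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;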
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
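&lt;br /&gt;
The follow and distinct animid counts quoted in the problem can probably be reproduced with a summary query.  The following is a sketch only, built on the same predicate as the query above; the output column names are illustrative and not part of the conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: counts the follows whose animid has trailing spaces,&lt;br /&gt;
-- and the number of distinct animids once the spaces are removed.&lt;br /&gt;
-- The output column names are illustrative.&lt;br /&gt;
select count(*) as follows&lt;br /&gt;
     , count(distinct rtrim(fol_b_animid)) as distinct_animids&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;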
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
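&lt;br /&gt;
The mapping described above might be sketched as the following query.  This is an illustration, not the actual conversion code: the period codes &amp;quot;AM&amp;quot; and &amp;quot;PM&amp;quot; and the output column names are assumptions, and the substitutions described in problems #37 and #38 are not applied.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
-- The period codes and output column names are assumptions.&lt;br /&gt;
-- One row per follow and time period, skipping a period when both&lt;br /&gt;
-- of its observer columns are NULL or empty after trimming.&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(follow.fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(follow.fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(follow.fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(follow.fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;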
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
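&lt;br /&gt;
No query is recorded for this problem.  A possible way to list the affected follows, assuming &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period is NULL or empty after trimming, is sketched here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the definition of &amp;quot;only one observer&amp;quot; is an assumption.&lt;br /&gt;
-- Each comparison is true when exactly one of the pair is empty.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;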
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
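&lt;br /&gt;
The effect of this substitution can be previewed with a summary query.  The following is a sketch only: it assumes fa_type_of_cycle is a text column, and using COALESCE is an assumption about how the conversion might apply the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code, not a description of the actual conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: counts arrivals per cycle code, with NULL shown as MISS.&lt;br /&gt;
-- Assumes fa_type_of_cycle is a text column.&lt;br /&gt;
select coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;)&lt;br /&gt;
  order by cycle_state;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;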
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
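&lt;br /&gt;
The cutoff arithmetic in the query above can be checked on its own. The following is a minimal sketch using a made-up follow date of 2020-06-15 (an illustration only, not a value from the data). Subtracting 1 year and then 14 years gives the latest birthdate that the query flags, which is the date 15 years before the follow.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the follow date is a hypothetical example value.&lt;br /&gt;
select (date &amp;#039;2020-06-15&amp;#039;&lt;br /&gt;
        - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
        - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
       )::date as latest_flagged_birthdate;&lt;br /&gt;
-- Result: 2005-06-15&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;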
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
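&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied as a single UPDATE against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and reusing the join condition from the query in the Bad data section. The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: not necessarily how the conversion performs the change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;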
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
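&lt;br /&gt;
The two case corrections above could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with plain UPDATE statements. This is an illustration of the rule, not the conversion code itself.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the two known variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;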
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
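&lt;br /&gt;
The Problem section also mentions leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is given for that case. The following is a sketch of what it might look like, assuming it has the same shape as the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; removed from the grouping and from the match conditions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;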
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
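&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with a single UPDATE that uses the same &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; test as the query in the Bad data section:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;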
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE when the column name is spelled with a lowercase final &amp;quot;d&amp;quot;: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
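&lt;br /&gt;
Because PostgreSQL treats quoted identifiers as case-sensitive, the exact spelling of each column name matters in the query above. One way to check the spelling, sketched here with the standard &amp;lt;code&amp;gt;information_schema.columns&amp;lt;/code&amp;gt; view, is to list the columns of the table as loaded into the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the column names exactly as stored, in table order.&lt;br /&gt;
select column_name&lt;br /&gt;
  from information_schema.columns&lt;br /&gt;
  where table_schema = &amp;#039;raw&amp;#039;&lt;br /&gt;
        and table_name = &amp;#039;OTHER_SPECIES&amp;#039;&lt;br /&gt;
  order by ordinal_position;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;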
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
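&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL, without&lt;br /&gt;
-- checking anything else about the affected rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;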
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
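&lt;br /&gt;
The replacement step could be sketched as below. This is an illustration only and is not taken from the commit referenced above. It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES; that insert is left out because the columns of ARRIVAL_SOURCES are not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a &amp;#039;none&amp;#039; code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;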
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
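&lt;br /&gt;
The mapping above could be expressed as a CASE expression. The following is a sketch that previews the mapping next to the row counts; the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, and the conversion itself may implement the rule differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: any value not listed below maps to NULL, which makes&lt;br /&gt;
-- unexpected flag values easy to spot in the output.&lt;br /&gt;
select ae.ae_bad_observation_flag&lt;br /&gt;
     , case&lt;br /&gt;
         when ae.ae_bad_observation_flag is null then false&lt;br /&gt;
         when ae.ae_bad_observation_flag = &amp;#039; &amp;#039; then false&lt;br /&gt;
         when ae.ae_bad_observation_flag in (&amp;#039;YES&amp;#039;, &amp;#039;X&amp;#039;) then true&lt;br /&gt;
       end as bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.aggression_event as ae&lt;br /&gt;
  group by ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;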
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
ORDER BY&lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than `Y` or `N` as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
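&lt;br /&gt;
Pending confirmation, the proposed mapping could be sketched as a PostgreSQL &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression over inline sample values. Trimming and upper-casing before the comparison is an assumption borrowed from the bad-data queries above, the output column name is hypothetical, and values not covered by the list (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) are left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; rather than guessed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sample values stand in for ae_recipient_certainty_flag.&lt;br /&gt;
SELECT&lt;br /&gt;
  t.raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(t.raw_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;                      -- spaces&lt;br /&gt;
    WHEN UPPER(BTRIM(t.raw_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(t.raw_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(t.raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;          -- Y, YES, ...&lt;br /&gt;
    -- Unlisted values (and NULL) fall through to NULL.&lt;br /&gt;
  END AS sokwe_flag&lt;br /&gt;
FROM (VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;?&amp;#039;)) AS t(raw_flag);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;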
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
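&lt;br /&gt;
Pending confirmation, the two stages could be sketched in PostgreSQL as follows, using inline sample values in place of a flag column. The CTE and column names are hypothetical. Treating whitespace-only values the same as the empty string in the second stage is an assumption, as is the case-insensitive comparison; &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; handling is left open and passes through unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH samples(raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039;?&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;X?&amp;#039;), (&amp;#039; &amp;#039;)&lt;br /&gt;
),&lt;br /&gt;
cleaned AS (&lt;br /&gt;
  -- Stage 1: clean schema conversions.&lt;br /&gt;
  SELECT&lt;br /&gt;
    raw_flag,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(raw_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE raw_flag&lt;br /&gt;
    END AS clean_flag&lt;br /&gt;
  FROM samples&lt;br /&gt;
)&lt;br /&gt;
-- Stage 2: sokwe conversions.&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_flag,&lt;br /&gt;
  clean_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(clean_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;  -- assumption: blank or empty&lt;br /&gt;
    WHEN clean_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS sokwe_flag&lt;br /&gt;
FROM cleaned;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;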
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields one pivoted record per participant (ae_b_aggressor_id and ae_b_recipient_id), not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
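&lt;br /&gt;
Because each event can contribute up to two pivoted rows, the number of underlying AGGRESSION_EVENT rows can be counted without the lateral join. The following is a sketch only and has not been run against the data; it assumes the same exact (untrimmed) comparison and the same exclusion of NULL and empty ids as the queries in this section, and the &amp;lt;code&amp;gt;event_count&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count each event once if its aggressor and/or recipient is unmatched.&lt;br /&gt;
SELECT COUNT(*) AS event_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
    ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
  OR (&lt;br /&gt;
    ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
    )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;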
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table outside their valid study participation window in the biography data. The query strips whitespace when comparing aggression participant ids with biography animid values, because the focus here is on events outside the defined date ranges rather than on id matching, and it assumes that whitespace issues will be resolved (this is also done in the migration work-around). The query further restricts the check to aggression events that have a valid follow record (this is not done in the migration work-around).&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=595</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=595"/>
		<updated>2026-05-20T19:48:35Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Refactor problem #76 to consider whitespace when comparing aggression and biography ids&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds: KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/1/2024. Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4.&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
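&lt;br /&gt;
A minimal sketch of the substitution described above, assuming it is applied in SQL, that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is textual, and that &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code given to the unknown person.  The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute the unknown person for NULL made_by values.&lt;br /&gt;
-- The &amp;#039;UNK&amp;#039; person code is an assumption.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;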
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
==(#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
==(#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
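&lt;br /&gt;
Combined with the trailing-space removal of problem #31, the cleanup might look like the following sketch.  This page does not show how the conversion process actually applies it, and the &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; output names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list the follows whose animid changes when trailing&lt;br /&gt;
-- spaces are removed and the value is forced to upper-case.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;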
&lt;br /&gt;
==(#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
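&lt;br /&gt;
As a rough sketch of the rule above (an illustration only, not the conversion code itself), the following query shows, for each follow, whether an AM row and a PM row would be created.  The output column names &amp;lt;code&amp;gt;makes_am_row&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;makes_pm_row&amp;lt;/code&amp;gt; are made up for this example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: which time periods would get a FOLLOW_OBSERVERS row&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , (coalesce(btrim(follow.fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;)&lt;br /&gt;
       as makes_am_row&lt;br /&gt;
     , (coalesce(btrim(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;)&lt;br /&gt;
       as makes_pm_row&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;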
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observer in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
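&lt;br /&gt;
No query is recorded for this problem.  One possibility, modeled on the query in problem #37, is to list the follows where exactly one of the two observer columns of a time period is filled in.  This is an untested sketch, not the query used during the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows with exactly one observer in the AM or the PM period.&lt;br /&gt;
-- Comparing two booleans with &amp;lt;&amp;gt; is true when exactly one of them is true.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;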
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
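&lt;br /&gt;
The query above keeps the self-join on case-insensitive equality from problem #40, so it reports only those &amp;quot;/&amp;quot; names that also have a variant differing in character case.  If the intent is to list every distinct observer value containing a &amp;quot;/&amp;quot;, a simpler form might look like the following.  This is an untested sketch; the CTE name &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; is made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value containing a &amp;quot;/&amp;quot;&lt;br /&gt;
-- (UNION removes duplicates; NULLs drop out in the WHERE clause)&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT observers.person&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(observers.person), observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;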
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
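&lt;br /&gt;
When checking the dates and ids, a per-individual summary may help show whether a few individuals account for most of the rows.  The following is an untested sketch; the output column names &amp;lt;code&amp;gt;arrivals&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;first_arrival&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;last_arrival&amp;lt;/code&amp;gt; are made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: summarize the late arrivals by arriving individual&lt;br /&gt;
select follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_departdate&lt;br /&gt;
     , count(*) as arrivals&lt;br /&gt;
     , min(follow_arrival.fa_fol_date) as first_arrival&lt;br /&gt;
     , max(follow_arrival.fa_fol_date) as last_arrival&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid, biography.b_departdate&lt;br /&gt;
  order by count(*) desc, follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;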
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
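&lt;br /&gt;
To illustrate the endpoint arithmetic in the query above, here is the cutoff it computes for a single, made-up follow date.  The date &amp;lt;code&amp;gt;2000-06-15&amp;lt;/code&amp;gt; and the column name &amp;lt;code&amp;gt;cutoff&amp;lt;/code&amp;gt; are assumptions for the example only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example: the birthdate cutoff for a follow on 2000-06-15.&lt;br /&gt;
-- Subtracting an interval from a date yields a timestamp.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
-- cutoff is 1985-06-15 00:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So for that follow date a female is flagged only when born on or before 1985-06-15, that is, when she is at least 15 years old.&lt;br /&gt;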
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
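&lt;br /&gt;
The effect of the two case corrections above can be previewed without changing any data.  The following is an untested sketch, not the cleanup code itself; the output column names &amp;lt;code&amp;gt;old_source&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_source&amp;lt;/code&amp;gt; are made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: preview the case corrections, with row counts&lt;br /&gt;
select follow_arrival.fa_data_source as old_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as new_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;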
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
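&lt;br /&gt;
No query is shown for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch, modeled on the query above, follows.  It is untested, and whether it reproduces the 430 rows exactly has not been verified.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;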
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
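&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the schema is an assumption; the query above does not name one). It is not necessarily the statement the conversion actually uses:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical cleanup: strip trailing spaces from the focal animid.&lt;br /&gt;
-- Only rows that actually have trailing spaces are touched.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;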
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
The column name must be spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;. With the spelling &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; the query fails with:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
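&lt;br /&gt;
As a sketch only (the conversion may apply the change differently), the replacement could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical: replace 0 animal ID numbers with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;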
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single SF/EVL entry to EVL in Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
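&lt;br /&gt;
One way to express this mapping, as a sketch that assumes the flag is read from &amp;lt;code&amp;gt;easy.aggression_event&amp;lt;/code&amp;gt; as text. The output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical mapping; any unlisted value yields NULL so that it stands out.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
  FROM easy.aggression_event AS ae;&amp;lt;/pre&amp;gt;&lt;br /&gt;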
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
ORDER BY&lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id values that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
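&lt;br /&gt;
Pending that confirmation, the proposed conversions could be sketched as below. This assumes the flag is stored as text in &amp;lt;code&amp;gt;clean.aggression_event&amp;lt;/code&amp;gt;; the output column name &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is made up for illustration, and values not in the list above (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) are left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; because the list does not say how to treat them:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch of the listed conversions; not the actual conversion code.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_recipient_certainty_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag IN (&amp;#039; &amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
  END AS converted_flag&lt;br /&gt;
FROM clean.aggression_event ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;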
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
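** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;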
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
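&lt;br /&gt;
Pending that confirmation, the clean schema step could be sketched for a single flag as below. The choice of &amp;lt;code&amp;gt;ae_decided_flag&amp;lt;/code&amp;gt;, the source table, and the output column name &amp;lt;code&amp;gt;cleaned_flag&amp;lt;/code&amp;gt; are assumptions for illustration; the other flags would follow the same pattern:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch of the listed clean schema conversions for one flag.&lt;br /&gt;
-- Values that match no pattern (including NULL) pass through unchanged.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN ae.ae_decided_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
    WHEN ae.ae_decided_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
    WHEN ae.ae_decided_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
    ELSE ae.ae_decided_flag&lt;br /&gt;
  END AS cleaned_flag&lt;br /&gt;
FROM clean.aggression_event ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;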
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 10,866 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields one pivoted record per participant column (ae_b_aggressor_id and ae_b_recipient_id), not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude rows where ae_b_recipient_id is NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id IS NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
* Whitespace is considered in the incongruence such that, for example, `VIN ` (with a space) does not match `VIN` in biography.&lt;br /&gt;
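&lt;br /&gt;
The inflation described in the first note can be avoided by counting each AGGRESSION_EVENT row at most once. The following is a minimal sketch, not a verified replacement for the count above; it assumes the same exact-match rule (no whitespace trimming) and the same exclusion of NULL and empty participants as the queries in this section:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count event rows, not pivoted Actor/Actee records.&lt;br /&gt;
SELECT COUNT(*) AS event_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE (&lt;br /&gt;
        ae.ae_b_aggressor_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_aggressor_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_aggressor_id&lt;br /&gt;
        )&lt;br /&gt;
      )&lt;br /&gt;
   OR (&lt;br /&gt;
        ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND ae.ae_b_recipient_id &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = ae.ae_b_recipient_id&lt;br /&gt;
        )&lt;br /&gt;
      );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A row where both the Actor and the Actee are missing from biography is counted once here, whereas the pivoted queries report it twice.&lt;br /&gt;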
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;trimmed_match&amp;#039;&amp;#039; denotes a match between Actor/Actee and biography if whitespace is trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant AS raw_participant,&lt;br /&gt;
    b_trimmed.b_animid AS trimmed_match,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, ae.ae_b_aggressor_id),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, ae.ae_b_recipient_id)&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
LEFT JOIN clean.biography b_trimmed&lt;br /&gt;
  ON BTRIM(v.participant) = BTRIM(b_trimmed.b_animid)&lt;br /&gt;
WHERE v.participant IS NOT NULL&lt;br /&gt;
  AND v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY v.role_name, v.participant, b_trimmed.b_animid&lt;br /&gt;
ORDER BY row_count DESC, v.role_name, v.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee are in the AGGRESSION_EVENT table but outside their valid study participation window in the biography data.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=594</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=594"/>
		<updated>2026-05-19T21:12:10Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #79 (#79) there are AGGRESSION_EVENT participants outside their valid study participation window in the biography data.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
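&lt;br /&gt;
A minimal sketch of the substitution described above. It assumes the replacement is applied while reading from clean.community_membership_update_log and that the unknown person&amp;#039;s code is the literal &amp;#039;UNK&amp;#039;; the conversion code in the commit cited above may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace a NULL made_by with the unknown person (UNK).&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;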
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which value is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
end time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which value is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
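&lt;br /&gt;
A minimal sketch, assuming the follow&amp;#039;s begin and end times are taken from the focal&amp;#039;s own FOLLOW_ARRIVAL rows, as the queries for problems #24 and #25 do.  The column names derived_begin and derived_end are illustrative only and are not part of either schema:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Per follow, the earliest start and the latest end&lt;br /&gt;
-- among the focal&amp;#039;s own arrival rows.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;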
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
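&lt;br /&gt;
No query is recorded for this problem.  A possible sketch, modeled on the query for problem #37 and assuming that a NULL or blank observer column means there is no observer, lists the follows where exactly one observer of the AM pair or of the PM pair is present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- A pair is incomplete when exactly one of its two columns is blank,&lt;br /&gt;
-- that is, when the two &amp;quot;is blank&amp;quot; tests disagree.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;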
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
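&lt;br /&gt;
The query above reports a name containing &amp;quot;/&amp;quot; only when another observer value differs from it by character case alone.  As a hedged alternative sketch, not the query used to produce the counts above, the following lists every distinct observer value containing &amp;quot;/&amp;quot;.  The alias names obs and observer are illustrative:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Turn the four observer columns into rows, then keep the&lt;br /&gt;
-- distinct values that contain a &amp;quot;/&amp;quot;.  NULLs drop out because&lt;br /&gt;
-- STRPOS returns NULL for them.&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.observer) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)) AS obs (observer)&lt;br /&gt;
  WHERE STRPOS(obs.observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;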
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed many rows in MS Access 5/5/2026.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
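&lt;br /&gt;
To illustrate the boundary arithmetic in the query above, the cutoff can be computed for a single follow date. The date 2020-06-15 is hypothetical, chosen only for illustration and not taken from the data; this is a sketch of the date arithmetic, not part of the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date, for illustration only.&lt;br /&gt;
-- Subtracting 1 year and then 14 years moves the cutoff back 15 years.&lt;br /&gt;
select &amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The cutoff is 15 June 2005, so a birthdate on or before that day (an age of 15 years or more on the follow date) is flagged.&lt;br /&gt;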
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
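&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a plain UPDATE in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is an accepted &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; value there; the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;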
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
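&lt;br /&gt;
The two case corrections above could be sketched as a single UPDATE. This assumes the change is made directly in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the miscapitalized data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
        = case&lt;br /&gt;
            when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
          end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The WHERE clause restricts the update to the three miscapitalized values, so every other code is left untouched.&lt;br /&gt;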
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
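&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above. A sketch of it, written by analogy with the previous query, follows; it is presumably how the 430-row figure was obtained, but that is an assumption:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because &amp;lt;code&amp;gt;dups&amp;lt;/code&amp;gt; has one row per key combination, the join returns each follow_arrival row at most once. As in the query above, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time never match.&lt;br /&gt;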
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
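&lt;br /&gt;
A minimal sketch of the removal, assuming it is applied as a plain UPDATE on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no constraint blocks the change; the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;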
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
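&lt;br /&gt;
Before deleting anything, it may help to know whether the duplicated keys belong to rows that are identical in every column. The following is a sketch of one way to check, assuming every column type in the table supports equality comparison (which &amp;lt;code&amp;gt;SELECT DISTINCT&amp;lt;/code&amp;gt; requires):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: compare the total row count with the count of distinct rows.&lt;br /&gt;
SELECT (SELECT count(*) FROM raw.&amp;quot;GROOM_BOUT&amp;quot;) AS all_rows&lt;br /&gt;
     , (SELECT count(*)&lt;br /&gt;
          FROM (SELECT DISTINCT * FROM raw.&amp;quot;GROOM_BOUT&amp;quot;) AS d&lt;br /&gt;
       ) AS distinct_rows;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If &amp;lt;code&amp;gt;all_rows&amp;lt;/code&amp;gt; is larger than &amp;lt;code&amp;gt;distinct_rows&amp;lt;/code&amp;gt;, some rows are exact copies of others. Rows that share a key but differ elsewhere need a closer look.&lt;br /&gt;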
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
Note that the column name ends in upper-case &amp;lt;code&amp;gt;ID&amp;lt;/code&amp;gt;. Spelling it &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; in the query produces:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
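&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a plain UPDATE in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace zero animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;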
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
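&lt;br /&gt;
The mapping could be sketched as a CASE expression. The column alias &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and the flag is assumed to be a text column. This is an illustration of the rule, not the conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show how each existing flag value would be mapped.&lt;br /&gt;
-- Values not covered by the rule map to NULL, so they stay visible.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Leaving out an ELSE branch is deliberate: an unexpected flag value shows up with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; mapping instead of being silently treated as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;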
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. Each duplicate has a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
ORDER BY&lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that have no matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 distinct (!) aggression_event.ae_fol_b_focal_id values that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are among the rows excluded as part of Problem #70. As such, there is no separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows whose ae_recipient_certainty_flag is not one of the required values &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
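&lt;br /&gt;
If the mappings above are confirmed, they could be expressed roughly as follows. This is a sketch against inline sample values, not the conversion code. It assumes that &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; is meant as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern, and that trimming whitespace and ignoring case are acceptable; the &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;converted&amp;lt;/code&amp;gt; names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical illustration of the proposed (unconfirmed) mappings.&lt;br /&gt;
SELECT raw_flag,&lt;br /&gt;
       CASE&lt;br /&gt;
         WHEN UPPER(BTRIM(raw_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;  -- spaces, N, NO&lt;br /&gt;
         WHEN UPPER(BTRIM(raw_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN UPPER(BTRIM(raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;       -- Y, YES, ...&lt;br /&gt;
       END AS converted  -- NULL and unlisted values are left unmapped&lt;br /&gt;
  FROM (VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;)) AS sample(raw_flag);&amp;lt;/pre&amp;gt;&lt;br /&gt;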
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows where one or more of the flags has a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
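** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;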
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined rows (but see the notes that follow) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows with a NULL ae_b_recipient_id are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as &amp;lt;code&amp;gt;group&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;males&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;females&amp;lt;/code&amp;gt;, etc.?&lt;br /&gt;
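&lt;br /&gt;
A sketch of how the actual number of affected AGGRESSION_EVENT rows might be counted, so that a row with both an unknown Actor and an unknown Actee is counted once. It reuses the scoping and blank-value handling of the Bad data queries in this section; the &amp;lt;code&amp;gt;affected_rows&amp;lt;/code&amp;gt; alias is made up, and the result has not been checked against the counts quoted above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: one count per AGGRESSION_EVENT row, not per role.&lt;br /&gt;
SELECT count(*) AS affected_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    -- Actor is non-blank and missing from biography&lt;br /&gt;
    (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
         SELECT 1&lt;br /&gt;
         FROM clean.biography b&lt;br /&gt;
         WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)))&lt;br /&gt;
    -- Actee is non-blank and missing from biography&lt;br /&gt;
 OR (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
         SELECT 1&lt;br /&gt;
         FROM clean.biography b&lt;br /&gt;
         WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)))&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;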
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#79) There are AGGRESSION_EVENT participants outside their valid study participation window in the biography data. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 46 records where an Actor and/or Actee appears in the AGGRESSION_EVENT table on a date outside their valid study participation window in the biography data.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH candidate_agg AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      ae.ae_date,&lt;br /&gt;
      ae.ae_time,&lt;br /&gt;
      ae.ae_fol_b_focal_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS actor_id,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS actee_id,&lt;br /&gt;
      ae.ae_source,&lt;br /&gt;
      ae.ae_full_description&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.follow f&lt;br /&gt;
          WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
            AND f.fol_date     = ae.ae_date&lt;br /&gt;
        )&lt;br /&gt;
    AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
    AND EXISTS (&lt;br /&gt;
          SELECT 1&lt;br /&gt;
          FROM clean.biography b&lt;br /&gt;
          WHERE b.b_animid = BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;))&lt;br /&gt;
        )&lt;br /&gt;
),&lt;br /&gt;
participants AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actor&amp;#039; AS role,&lt;br /&gt;
      c.actor_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT&lt;br /&gt;
      c.ae_date,&lt;br /&gt;
      c.ae_time,&lt;br /&gt;
      c.ae_fol_b_focal_id,&lt;br /&gt;
      &amp;#039;Actee&amp;#039; AS role,&lt;br /&gt;
      c.actee_id AS participant,&lt;br /&gt;
      c.ae_source,&lt;br /&gt;
      c.ae_full_description&lt;br /&gt;
  FROM candidate_agg c&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    p.ae_date,&lt;br /&gt;
    p.ae_time,&lt;br /&gt;
    p.ae_fol_b_focal_id,&lt;br /&gt;
    p.role,&lt;br /&gt;
    p.participant,&lt;br /&gt;
    b.b_entrydate,&lt;br /&gt;
    b.b_departdate,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN p.ae_date &amp;lt; b.b_entrydate THEN &amp;#039;before_entry&amp;#039;&lt;br /&gt;
      WHEN p.ae_date &amp;gt; b.b_departdate THEN &amp;#039;after_departure&amp;#039;&lt;br /&gt;
      ELSE &amp;#039;ok&amp;#039;&lt;br /&gt;
    END AS violation_type,&lt;br /&gt;
    p.ae_source,&lt;br /&gt;
    p.ae_full_description&lt;br /&gt;
FROM participants p&lt;br /&gt;
JOIN clean.biography b&lt;br /&gt;
  ON b.b_animid = p.participant&lt;br /&gt;
WHERE p.ae_date &amp;lt; b.b_entrydate&lt;br /&gt;
   OR p.ae_date &amp;gt; b.b_departdate&lt;br /&gt;
ORDER BY p.ae_date, p.ae_fol_b_focal_id, p.ae_time, p.role, p.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=588</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=588"/>
		<updated>2026-05-19T18:47:01Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #78 there are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
      -------------+--------------------+----------&lt;br /&gt;
       TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
Codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
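&lt;br /&gt;
The substitution described above can be sketched in PostgreSQL as follows.  This is only an illustration, not the code from the commit: the inline sample rows, the person code XX, and the made_by_clean alias are hypothetical.  Only the made_by column name and the UNK code come from this page.&lt;br /&gt;

```sql
-- Hypothetical sample rows standing in for the update log.
WITH sample_log (chimp_id, made_by) AS (
  VALUES ('AAA', 'XX'),
         ('BBB', NULL)
)
-- COALESCE returns its first non-NULL argument, so a NULL made_by
-- is replaced by the unknown person code.
SELECT chimp_id
     , COALESCE(made_by, 'UNK') AS made_by_clean
  FROM sample_log
  ORDER BY chimp_id;
```

Because the substitute is an ordinary row in the people table, it can later be marked inactive without changing the converted log rows.&lt;br /&gt;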
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
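&lt;br /&gt;
The cleanup for problems #31 and #32 can be sketched together in PostgreSQL.  This is an illustration under assumptions, not the conversion program itself: the inline sample values and the original and cleaned aliases are hypothetical.  The upper(rtrim(...)) expression is the same one used in the queries for problem #33.&lt;br /&gt;

```sql
-- Hypothetical sample values; the real data lives in easy.follow.
WITH sample_follow (fol_b_animid) AS (
  VALUES ('AAA  '),
         ('bbb'),
         ('CCC')
)
-- rtrim() strips trailing spaces (problem #31); upper() forces
-- upper-case (problem #32).  Already clean values pass through unchanged.
SELECT fol_b_animid AS original
     , upper(rtrim(fol_b_animid)) AS cleaned
  FROM sample_follow
  ORDER BY cleaned;
```

Applying both functions in one expression means a value that has both defects is fixed in a single pass.&lt;br /&gt;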
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
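&lt;br /&gt;
As a hedged sketch only (the conversion code itself is not shown on this page), the substitution could be expressed with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: arrivals with no cycle information get the MISS code&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;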
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
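&lt;br /&gt;
To illustrate the 15-year cutoff noted above, here is a small sketch using an arbitrary, made-up follow date.  With this date, a female born on or before 1995-06-15 would be flagged by the test.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the cutoff birthdate for a follow on an illustrative date&lt;br /&gt;
select date &amp;#039;2010-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;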
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
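&lt;br /&gt;
A sketch of what such a change might look like, not necessarily the statement actually used.  It assumes &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; is unique and that the &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; column is wide enough to hold &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace n/a with MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;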
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
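&lt;br /&gt;
The effect of the two case fixes can be previewed with a sketch like the following.  The &amp;lt;code&amp;gt;cleaned_data_source&amp;lt;/code&amp;gt; alias is hypothetical, and the actual cleanup code is not shown on this page.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: preview the case-normalized data source codes&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as cleaned_data_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;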
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
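&lt;br /&gt;
The start-time-only check described in the problem statement has no query on this page.  A sketch modeled on the query above, dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, might look like the following.  It assumes &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; is never NULL; rows with a NULL start time are grouped together by &amp;lt;code&amp;gt;GROUP BY&amp;lt;/code&amp;gt; but never match on &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;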
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
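&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is applied in place to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion itself may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- Assumes the update is made in place in the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;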
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
An earlier version of this query spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (lower-case &amp;quot;d&amp;quot;) and failed with the following message:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is brute-force, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
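&lt;br /&gt;
As a sketch, assuming the change is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the 0 placeholders with NULL.&lt;br /&gt;
-- No other column of the affected rows is checked or changed.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;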
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
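&lt;br /&gt;
The mapping can be previewed before it is applied. The following is a sketch only: the alias &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, an empty string is assumed to count as spaces, and any value not named in the solution maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: preview the proposed flag-to-boolean mapping.&lt;br /&gt;
-- The alias bad_observation is hypothetical.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         -- All-space (or empty) values are treated as FALSE.&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Anything else falls through to NULL.&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;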
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event records that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
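&lt;br /&gt;
Pending that confirmation, the proposed conversions can be previewed against the current data. This is a sketch only: the aliases &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;row_count&amp;lt;/code&amp;gt; are illustrative, and values the list does not cover (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) are left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that they remain visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: preview the proposed sokwe conversions.&lt;br /&gt;
-- Values not in the list above map to NULL.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;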
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
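&lt;br /&gt;
A minimal sketch, assuming the same scoping rules as the queries in the Bad data section, that counts each affected AGGRESSION_EVENT row once instead of once per role. The &amp;lt;code&amp;gt;event_row_count&amp;lt;/code&amp;gt; alias is illustrative only, and the result has not been checked against the data:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count each offending AGGRESSION_EVENT row once, not once per role.&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  -- True when the aggressor, the recipient, or both are missing from biography.&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM (VALUES&lt;br /&gt;
          (BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
          (BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
       ) AS v(participant)&lt;br /&gt;
  WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    AND NOT EXISTS (&lt;br /&gt;
      SELECT 1&lt;br /&gt;
      FROM clean.biography b&lt;br /&gt;
      WHERE b.b_animid = v.participant&lt;br /&gt;
    )&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;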
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 2) Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 1) Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=587</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=587"/>
		<updated>2026-05-19T18:04:12Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #77 There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
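&lt;br /&gt;
The conversion code is not shown on this page. One way such a fix could look, assuming BREC_time is a timestamp column whose date part carries no meaning, is to keep only the time of day. The &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: discard the date part and keep the time of day.&lt;br /&gt;
SELECT &amp;quot;BREC_time&amp;quot;::TIME AS brec_time&lt;br /&gt;
  FROM raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;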
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4.&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
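&lt;br /&gt;
A minimal sketch of the substitution, assuming an UNK row already exists in the people table. The actual conversion code is not shown on this page, so this only illustrates the idea:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: map NULL made_by values to the unknown person code.&lt;br /&gt;
SELECT COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;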
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
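&lt;br /&gt;
As a rough cross-check of the counts above, the following sketch tallies the affected follows and the distinct animids involved.  The column aliases are illustrative only, and it assumes that the trimmed value is what identifies an animid.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count the follows having trailing spaces, and the&lt;br /&gt;
-- distinct animids that remain once the spaces are removed.&lt;br /&gt;
select count(*) as follow_count&lt;br /&gt;
     , count(distinct rtrim(fol_b_animid)) as distinct_animids&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;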
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
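&lt;br /&gt;
A minimal sketch of the combined cleanup for problems #31 and #32, previewing the raw and the normalized animids side by side.  It uses the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The actual conversion code may differ, and the column aliases are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every raw animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned-up form.&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by cleaned_animid, raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;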
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
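&lt;br /&gt;
No query is recorded for this problem.  The following is a sketch, modeled on the query for problem #37, that should list the follows having exactly one observer in either the AM or the PM period.  It assumes that an observer column counts as empty when it is NULL or blank after trimming.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: within a period, one observer column is empty&lt;br /&gt;
-- and the other is not.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;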
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
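&lt;br /&gt;
One way the substitution might look, shown for the AM period only (the PM period would be analogous).  This is a sketch, not the conversion code; the &amp;lt;code&amp;gt;obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;obs_tiki&amp;lt;/code&amp;gt; aliases are illustrative and follow the column mapping described in problem #36.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace a missing AM observer with the NONE person.&lt;br /&gt;
-- Follows with no AM observers at all are skipped, because&lt;br /&gt;
-- no row is created for that period.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;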
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated to do in the conversion process, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
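&lt;br /&gt;
A minimal sketch of the substitution, previewing the value each affected arrival would receive.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is illustrative only, and the conversion process may apply the default differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: arrivals with no cycle information, shown with the&lt;br /&gt;
-- MISS value that would stand in for the NULL.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;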
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed many rows in MS Access 5/5/2026.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (brec swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
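&lt;br /&gt;
A minimal sketch of this change, assuming it is made with a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and restricted to females as in the query above (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a as MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
        and exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
                   and biography.b_sex = &amp;#039;F&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;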
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
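&lt;br /&gt;
The two case corrections above could be applied along these lines.  This is a sketch, assuming the fix is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that no other spellings need mapping:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known variants&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;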
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
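&lt;br /&gt;
The 430-row figure in the problem description comes from leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the comparison.  A sketch of that check, assuming it has the same shape as the query above with the end time dropped:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;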
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
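&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is an ordinary text column and the change is made directly in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that actually have trailing spaces&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;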
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- The column is spelled OS_FOL_B_focal_AnimID (upper-case ID); see the error message in the note below.&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything else about the affected rows.&lt;br /&gt;
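&lt;br /&gt;
A minimal sketch of the change, assuming it is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the placeholder 0 with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;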
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
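&lt;br /&gt;
The mapping can be previewed before it is applied.  The following is a sketch only: the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration, and any value not covered by the rule above is left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- hypothetical name&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;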
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id values that do not appear among the follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are among the rows excluded as part of Problem #70. As such, there is no separate exclusion clause in the conversion process for these data. As a corollary, resolving all issues in Problem #70 would resolve all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, the only values allowed.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
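&lt;br /&gt;
The proposed mapping can be previewed with a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is only a sketch of the list above, which still awaits confirmation. Trimming and upper-casing the flag before matching is an assumption borrowed from the bad-data queries, and the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; alias is hypothetical. Values the list does not cover (including NULL) are left as NULL so that they stand out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(ae.ae_recipient_certainty_flag) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    WHEN UPPER(BTRIM(ae.ae_recipient_certainty_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL  -- not covered by the proposed list (NULL included)&lt;br /&gt;
  END AS converted_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;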
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows in which one or more flags have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL, the only values allowed.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
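** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039; &amp;#039;&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039;0&amp;#039;&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;&amp;#039;X&amp;#039;&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039;1&amp;#039;&amp;lt;/code&amp;gt;&lt;br /&gt;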
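&lt;br /&gt;
As a rough sketch of how the two proposed stages would chain together, the query below applies them to a single flag, ae_chase_flag. The rules are unconfirmed. Trimming and upper-casing before matching, and treating NULL and blank values alike in the second stage, are assumptions. The &amp;lt;code&amp;gt;clean_flag&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;sokwe_flag&amp;lt;/code&amp;gt; aliases are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  -- Stage 1: proposed clean schema conversions.&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae.ae_chase_flag AS raw_flag,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae.ae_chase_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae.ae_chase_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae.ae_chase_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae.ae_chase_flag  -- everything else passes through unchanged&lt;br /&gt;
    END AS clean_flag&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
)&lt;br /&gt;
-- Stage 2: proposed sokwe conversions.&lt;br /&gt;
SELECT&lt;br /&gt;
  COALESCE(raw_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
  COALESCE(clean_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS clean_flag,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN BTRIM(COALESCE(clean_flag, &amp;#039;&amp;#039;)) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;  -- assumption: NULL and blank both map to 0&lt;br /&gt;
    WHEN clean_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL  -- still non-compliant after stage 1&lt;br /&gt;
  END AS sokwe_flag,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_flag, clean_flag&lt;br /&gt;
ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;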
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined rows (but see the notes below!) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if NULL ae_b_recipient_id values are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
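&lt;br /&gt;
To get the number of distinct AGGRESSION_EVENT rows rather than the pivoted count, each event can be counted once when either participant is missing from the BIOGRAPHY table. The following is a sketch under the same scoping as the queries in this section (matching FOLLOW row, non-NULL recipient); the &amp;lt;code&amp;gt;event_rows&amp;lt;/code&amp;gt; alias is hypothetical and the result has not been checked against the figures above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS event_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    -- Aggressor (Actor) is non-blank and unknown to BIOGRAPHY&lt;br /&gt;
    (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1&lt;br /&gt;
       FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)&lt;br /&gt;
     ))&lt;br /&gt;
    -- Recipient (Actee) is non-blank and unknown to BIOGRAPHY&lt;br /&gt;
 OR (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1&lt;br /&gt;
       FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)&lt;br /&gt;
     ))&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;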
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=586</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=586"/>
		<updated>2026-05-19T03:13:33Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #76 there are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
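&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  The sample rows are hypothetical and stand in for the update log; the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; not real data.&lt;br /&gt;
WITH log (chimp_id, made_by) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;P1&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
       -- A NULL MadeBy becomes the unknown person.&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;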
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
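&lt;br /&gt;
A sketch of how the nest flags might be derived from follow_arrival alone.  The reading of fa_type_of_nesting (1 or 3 meaning the focal began in a nest, 2 or 3 meaning the focal ended in a nest) is inferred from the queries for problems #26 and #27; the value 0 and the output column names are assumptions made for illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; 0 is assumed to mean &amp;quot;no nest&amp;quot;.&lt;br /&gt;
WITH arrivals (fa_type_of_nesting) AS (&lt;br /&gt;
  VALUES (0), (1), (2), (3))&lt;br /&gt;
SELECT fa_type_of_nesting&lt;br /&gt;
     , CASE WHEN fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
     , CASE WHEN fa_type_of_nesting IN (2, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS end_in_nest&lt;br /&gt;
  FROM arrivals&lt;br /&gt;
  ORDER BY fa_type_of_nesting;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;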
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
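&lt;br /&gt;
A minimal sketch of the cleanup, combined with the trailing-space removal from problem #31.  The sample values are hypothetical, and the actual conversion code may apply the two steps separately.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; not real animids.&lt;br /&gt;
WITH follow (fol_b_animid) AS (&lt;br /&gt;
  VALUES (&amp;#039;ABC  &amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;ABC&amp;#039;))&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
       -- Strip trailing spaces, then force upper-case.&lt;br /&gt;
     , UPPER(RTRIM(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;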
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
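&lt;br /&gt;
The population rule described above can be sketched as follows.  This is an illustration, not the conversion code: the sample row is hypothetical, and the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumed.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One hypothetical follow: two AM observers, no PM observers.&lt;br /&gt;
WITH follow (fol_date, fol_b_animid&lt;br /&gt;
           , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
           , fol_pm_observer_1, fol_pm_observer_2) AS (&lt;br /&gt;
  VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;ABC&amp;#039;&lt;br /&gt;
        , &amp;#039;P1&amp;#039;, &amp;#039;P2&amp;#039;&lt;br /&gt;
        , &amp;#039;  &amp;#039;, CAST(NULL AS text)))&lt;br /&gt;
, candidates AS (&lt;br /&gt;
  -- One candidate row per follow and time period.&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
       , BTRIM(fol_am_observer_1) AS obs_brec&lt;br /&gt;
       , BTRIM(fol_am_observer_2) AS obs_tiki&lt;br /&gt;
    FROM follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
       , BTRIM(fol_pm_observer_1)&lt;br /&gt;
       , BTRIM(fol_pm_observer_2)&lt;br /&gt;
    FROM follow)&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM candidates&lt;br /&gt;
  -- Keep a period only when at least one observer is recorded.&lt;br /&gt;
  WHERE COALESCE(obs_brec, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(obs_tiki, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
With the sample row, only the AM period survives the filter.&lt;br /&gt;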
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
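&lt;br /&gt;
A minimal sketch of the substitution, assuming blank and NULL observers are treated alike.  The sample rows are hypothetical, and the conversion may implement this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows, each missing the second observer.&lt;br /&gt;
WITH observers (obs_brec, obs_tiki) AS (&lt;br /&gt;
  VALUES (&amp;#039;P1&amp;#039;, CAST(NULL AS text))&lt;br /&gt;
       , (&amp;#039;P2&amp;#039;, &amp;#039;  &amp;#039;))&lt;br /&gt;
SELECT COALESCE(NULLIF(BTRIM(obs_brec), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
       -- NULLIF turns a blank into NULL; COALESCE then supplies NONE.&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(obs_tiki), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM observers;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;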
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
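&lt;br /&gt;
A minimal sketch of the second half of this, assuming (for illustration only; this page does not say where the substitution happens) that it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value already exists:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give arrivals with no cycle information the MISS code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;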
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
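&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; described above could be made with something like the following.  This is a sketch reusing the conditions of the bad data query, not necessarily the statement that was actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;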
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
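&lt;br /&gt;
The case corrections above might be applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with a statement along these lines.  This is a sketch; the statement actually used in the conversion may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the known variants.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = CASE fa_data_source&lt;br /&gt;
                         WHEN &amp;#039;BREC&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;brec&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;TikI&amp;#039; THEN &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       END&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;WHERE&amp;lt;/code&amp;gt; clause limits the update to the rows being corrected, so the &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; never falls through to NULL.&lt;br /&gt;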
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
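&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not shown.  A sketch, adapted from the query above by dropping that column (it has not been re-run here, so no row count is claimed):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;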
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
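&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the column is a variable-length text type (with a fixed-width &amp;lt;code&amp;gt;char(n)&amp;lt;/code&amp;gt; column, Postgres ignores trailing spaces in comparisons and the condition would match nothing):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;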
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
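&lt;br /&gt;
If deleting duplicates turns out to be acceptable, the following is a minimal sketch, not a decided fix, of listing one row per key from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema. It relies on the PostgreSQL &amp;lt;code&amp;gt;DISTINCT ON&amp;lt;/code&amp;gt; clause. Which of the duplicate rows is kept is arbitrary, so this is only appropriate once the duplicates are confirmed to be identical in their other columns:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: keeps an arbitrary row from each set of duplicates.&lt;br /&gt;
SELECT DISTINCT ON (&amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                  , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                  , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                  , &amp;quot;GRM_B_partner_AnimId&amp;quot;) *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
  ORDER BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;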
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
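&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion scripts may well do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the 0 placeholders with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;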
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
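&lt;br /&gt;
The following is an illustrative sketch of such a cleanup, written against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. It is not taken from the commit that addressed the problem, which may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;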
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
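&lt;br /&gt;
As an illustration only, the substitution could look like the following when reading from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema. It assumes the new ARRIVAL_SOURCES code is the literal string &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt;. The ARRIVAL_SOURCES row itself is not shown, because the column names of that table are not given here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: report NULL data sources as the assumed code none.&lt;br /&gt;
SELECT fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , COALESCE(fa_data_source, &amp;#039;none&amp;#039;) AS fa_data_source&lt;br /&gt;
  FROM easy.follow_arrival;&amp;lt;/pre&amp;gt;&lt;br /&gt;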
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
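&lt;br /&gt;
A sketch of this mapping as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression, shown next to the raw values so that the result can be checked. The column alias &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration. There is deliberately no &amp;lt;code&amp;gt;ELSE&amp;lt;/code&amp;gt; branch, so any value not covered by the rule shows up as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; instead of being silently converted:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: NULL and blank become FALSE, X and YES become TRUE.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;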
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per-community, per-year. The duplicates each have a `b_rec_english` value of `ALL` or `F-F (ALL); F-M (ALL); M-M (ALL)`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have a ae_recipient_certainty_flag other than `Y` or `N` as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
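&lt;br /&gt;
The candidate conversions listed above can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression and previewed per raw value before anything is changed. This is an illustration pending confirmation, not the conversion code. The alias &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is made up. The list does not say what happens to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; and any unlisted value are left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: preview the proposed mapping, one row per raw value.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039; &amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;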
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than `X` or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
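&lt;br /&gt;
The two conversion stages listed above can be sketched for a single flag as follows, using ae_chase_flag as the example. This is an illustration pending confirmation, not the conversion code. The names &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;clean_value&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; are made up. It also assumes that a single space and an empty string are both treated as blank in the second stage, which the list does not state:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: stage 1 mimics the Access patterns, stage 2 maps to 0/1.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT ae.ae_chase_flag AS raw_value&lt;br /&gt;
       , CASE&lt;br /&gt;
           WHEN ae.ae_chase_flag = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           WHEN ae.ae_chase_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           ELSE ae.ae_chase_flag&lt;br /&gt;
         END AS clean_value&lt;br /&gt;
    FROM clean.aggression_event AS ae&lt;br /&gt;
)&lt;br /&gt;
SELECT raw_value&lt;br /&gt;
     , clean_value&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN btrim(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;  -- assumption: blank means unset&lt;br /&gt;
         WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS sokwe_value&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM cleaned&lt;br /&gt;
  GROUP BY raw_value, clean_value&lt;br /&gt;
  ORDER BY row_count DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;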
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
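&lt;br /&gt;
The number of distinct AGGRESSION_EVENT rows, without the inflation from pivoting, could be obtained with a sketch along these lines. It reuses the scoping of the queries below and tests each role with its own &amp;lt;code&amp;gt;NOT EXISTS&amp;lt;/code&amp;gt;, so each event row is counted at most once. The alias &amp;lt;code&amp;gt;event_rows&amp;lt;/code&amp;gt; is made up, and the result has not been checked against the data:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: count events where either participant is missing from biography.&lt;br /&gt;
SELECT count(*) AS event_rows&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  WHERE EXISTS (SELECT 1&lt;br /&gt;
                  FROM clean.follow AS f&lt;br /&gt;
                  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
                        AND f.fol_date = ae.ae_date)&lt;br /&gt;
        AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
        AND ((btrim(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
              AND NOT EXISTS (SELECT 1&lt;br /&gt;
                                FROM clean.biography AS b&lt;br /&gt;
                                WHERE b.b_animid = btrim(ae.ae_b_aggressor_id)))&lt;br /&gt;
             OR (btrim(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
                 AND NOT EXISTS (SELECT 1&lt;br /&gt;
                                   FROM clean.biography AS b&lt;br /&gt;
                                   WHERE b.b_animid = btrim(ae.ae_b_recipient_id))));&amp;lt;/pre&amp;gt;&lt;br /&gt;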
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 2) Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 1) Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=585</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=585"/>
		<updated>2026-05-19T00:53:15Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #72 added a candidate solution.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
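&lt;br /&gt;
As a minimal sketch of the substitution, assuming the unknown person is stored under the code &amp;#039;UNK&amp;#039; (the actual conversion code may differ), the NULL values could be mapped with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration only: show each log row with a NULL made_by&lt;br /&gt;
-- replaced by the unknown person.  The code &amp;#039;UNK&amp;#039; and the output column&lt;br /&gt;
-- name made_by_or_unk are assumptions.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;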
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
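&lt;br /&gt;
A sketch of what ignoring the FOLLOW times could look like in practice, assuming (as in the queries above) that the focal&amp;#039;s own FOLLOW_ARRIVAL rows delimit the follow.  The output column names are made up for illustration, and this is not necessarily how the conversion does it:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Derive follow begin and end times from the focal&amp;#039;s arrivals,&lt;br /&gt;
-- rather than from follow.fol_time_begin and follow.fol_time_end.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_time_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_time_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;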
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
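&lt;br /&gt;
One way to read &amp;quot;default to Follow_Arrival&amp;quot; is sketched below.  It assumes, judging only from the query above, that fa_type_of_nesting values 1 and 3 mean the focal began in the nest, and it looks only at the focal&amp;#039;s own earliest arrival row.  The output column name is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- For each follow, take the focal&amp;#039;s earliest arrival row and derive a&lt;br /&gt;
-- begin-in-nest flag from it.  A NULL fa_type_of_nesting yields 0.&lt;br /&gt;
SELECT DISTINCT ON (fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
       fa_fol_date, fa_fol_b_focal_animid, fa_time_start&lt;br /&gt;
     , CASE WHEN fa_type_of_nesting IN (1, 3) THEN 1 ELSE 0 END&lt;br /&gt;
         AS derived_begin_in_nest&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid, fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;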
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
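&lt;br /&gt;
As a minimal sketch, not necessarily what the conversion code does, the cleanups described in problems #31 and #32 can be previewed together.  The query uses the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; form that appears in the query for problem #33.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every animid that the cleanup would change, with its cleaned form&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;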
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
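&lt;br /&gt;
No bad data query is recorded for this problem.  The following is a sketch of one, assuming that &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period (AM or PM) is filled in.  It reuses the &amp;lt;code&amp;gt;coalesce(btrim(...), &amp;#039;&amp;#039;)&amp;lt;/code&amp;gt; test from problem #37.  The rows it returns have not been checked against the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a time period has exactly one of its two observers.&lt;br /&gt;
-- Each parenthesized test is true when the column is empty, so comparing&lt;br /&gt;
-- the two tests with &amp;lt;&amp;gt; finds periods where only one column is empty.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;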
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
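&lt;br /&gt;
The query above only returns a name containing &amp;quot;/&amp;quot; when another name exists that differs from it only by character case.  As a simpler sketch, the following lists every observer value that contains a &amp;quot;/&amp;quot;, whether or not it has a case variant.  The &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; names are made up for illustration, and the result is not guaranteed to match the counts given in the problem description.  Values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; are included in the output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: distinct observer values containing a &amp;quot;/&amp;quot;, with how often each is used&lt;br /&gt;
select person, count(*) as uses&lt;br /&gt;
  from (select btrim(fol_am_observer_1) as person from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_am_observer_2) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_1) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_2) from clean.follow) as observers&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  group by person&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;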
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the resulting duplicates is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
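&lt;br /&gt;
One way to express the substitution is sketched below.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is made up for illustration, and the actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the value each affected arrival would be given&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;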
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
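&lt;br /&gt;
To illustrate the age arithmetic in the query above, here is a worked example with a made-up follow date.  Subtracting 1 year and then 14 years from the follow date gives the latest birthdate that is flagged, so a flagged female has reached her 15th birthday by the date of the follow.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: for a (hypothetical) follow on 2010-06-15,&lt;br /&gt;
-- the latest birthdate that the query flags&lt;br /&gt;
select &amp;#039;2010-06-15&amp;#039;::date&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_flagged_birthdate;&lt;br /&gt;
-- Result: 1995-06-15 00:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A female born on 1995-06-15 or earlier is therefore reported for that follow, while one born on 1995-06-16 is not.&lt;br /&gt;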
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
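&lt;br /&gt;
A minimal sketch of such a change is shown below, using the same join as the bad data query.  It is written as a direct update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema for illustration only.  How the conversion actually produces the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; data is not described here and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace n/a with MISS for arrivals of females&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;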
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
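&lt;br /&gt;
The case corrections above could be applied with something like the following. This is only a sketch: it assumes the fix is made by updating &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; in place, which may not be how the conversion scripts actually do it.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the cleanup is an in-place update of the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;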
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
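&lt;br /&gt;
The start-time-only check mentioned in the problem description is not shown above. A sketch of it, assuming it is simply the previous query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out of the grouping and the match, might be:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch; fa_time_end is ignored)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;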
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
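&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is done as an in-place update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion step may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only rows that actually have trailing spaces are touched.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;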
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
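&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update could look like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replaces every 0 with NULL; nothing else about the rows is checked.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;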
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
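&lt;br /&gt;
A sketch of the substitution, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that insert is not shown, since the table&amp;#039;s column names are not given here) and that the change is an in-place update:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;none&amp;#039; already exists as an ARRIVAL_SOURCES value.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;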
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
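&lt;br /&gt;
One way to preview this mapping, as a sketch only: the query below assumes the flag is a text column and that the mapping is expressed as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. The &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR BTRIM(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Any other value is left NULL so that it stands out.&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;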
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
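&lt;br /&gt;
Pending the confirmation requested above, the proposed conversions can be previewed with a query along these lines. It is a sketch that assumes exact, case-sensitive matches on the listed values and treats &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern. The &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- NULL and any unlisted value fall through to NULL here.&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , COUNT(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;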
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
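&lt;br /&gt;
The two proposed conversion steps can be sketched in Python as below. This is only an illustration, pending the confirmation requested above. The function names are hypothetical. The sketch assumes that the `Y%` and `X%` patterns are matched after trimming and upper-casing (as the bad-data queries do), that NULL and blank values pass through the clean step unchanged, and that both become '0' in sokwe. Any other value raises an error, so it is not converted silently.&lt;br /&gt;
&lt;br /&gt;

```python
def clean_flag(raw):
    """Clean-schema step (hypothetical name): map raw flag text to 'X' or blank."""
    if raw is None:
        return None  # assumption: NULL stays NULL in the clean schema
    value = raw.strip().upper()
    if value == "":
        return raw  # assumption: blank values pass through unchanged
    if value == "?":
        return " "  # `?` to ' '
    if value.startswith(("X", "Y")):
        return "X"  # `X%` to `X` and `Y%` to `X`
    raise ValueError("unmapped flag value: " + repr(raw))


def sokwe_flag(cleaned):
    """Sokwe step (hypothetical name): map a cleaned flag to '0' or '1'."""
    if cleaned is None or cleaned.strip() == "":
        return "0"  # '' to '0'; treating NULL the same way is an assumption
    if cleaned == "X":
        return "1"  # 'X' to '1'
    raise ValueError("unexpected cleaned flag: " + repr(cleaned))


for raw in ["X", "Yes", " x ", "?", None]:
    print(repr(raw), sokwe_flag(clean_flag(raw)))
```

Raising on unmapped values is a design choice of this sketch. It keeps values that the summary query reports, but that the list above does not cover, visible until a mapping is agreed.&lt;br /&gt;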
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=584</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=584"/>
		<updated>2026-05-18T23:13:43Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #75 there are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
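&lt;br /&gt;
The same check can be sketched outside the database, for example when validating an export. The Python below is a minimal sketch and not part of the conversion program. The function name and the tuple layout (animid, start date, end date, community) are assumptions, and the sample rows are invented. Like the query above, it pairs each stay with any later-starting stay that begins on or before the day the first one ends.&lt;br /&gt;
&lt;br /&gt;

```python
from datetime import date


def overlapping_memberships(rows):
    """Return (first, second) pairs of stays that overlap for one animal.

    rows: iterable of (animid, start_date, end_date, community_id) tuples.
    This tuple layout is an assumption made for the example.
    """
    by_animal = {}
    for row in rows:
        by_animal.setdefault(row[0], []).append(row)

    overlaps = []
    for stays in by_animal.values():
        stays.sort(key=lambda stay: stay[1])  # order by start date
        for i, first in enumerate(stays):
            for second in stays[i + 1:]:
                starts_later = second[1] > first[1]
                # A missing end date never matches, as with NULL in SQL.
                still_open = first[2] is not None and first[2] >= second[1]
                if starts_later and still_open:
                    overlaps.append((first, second))
    return overlaps


# Invented sample data, not taken from the database.
sample = [
    ("AAA", date(2020, 1, 1), date(2022, 8, 20), "C1"),
    ("AAA", date(2022, 8, 15), date(2023, 1, 1), "C2"),
    ("BBB", date(2020, 1, 1), date(2020, 12, 31), "C1"),
    ("BBB", date(2021, 1, 1), date(2021, 12, 31), "C2"),
]
for first, second in overlapping_memberships(sample):
    print(first[0], first[3], second[3])
```

With the invented rows above, only the two AAA stays are reported. As in the query, two stays that start on the same date are not paired with each other.&lt;br /&gt;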
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
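&lt;br /&gt;
A minimal sketch of this approach in Python, for illustration only. The PEOPLE dictionary, the person code ABC, and both function names are hypothetical and do not describe the code in the commit cited above.&lt;br /&gt;
&lt;br /&gt;

```python
UNKNOWN_PERSON = "UNK"

# Hypothetical stand-in for the PEOPLE table: person code mapped to an active flag.
PEOPLE = {
    UNKNOWN_PERSON: False,  # inactive: allowed in converted data only
    "ABC": True,
}


def convert_made_by(made_by):
    """Conversion: a NULL MadeBy becomes the unknown person."""
    return UNKNOWN_PERSON if made_by is None else made_by


def check_new_made_by(made_by):
    """New data entry: accept only people who exist and are active."""
    if made_by not in PEOPLE:
        raise ValueError("no such person: " + repr(made_by))
    if not PEOPLE[made_by]:
        raise ValueError("person is inactive: " + made_by)
    return made_by


print(convert_made_by(None))
print(check_new_made_by("ABC"))
```

Converted rows can therefore carry UNK, while newly entered rows that name UNK, or that leave MadeBy empty, are rejected.&lt;br /&gt;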
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
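&lt;br /&gt;
Taken together with problem #31, the cleanup can be sketched as a single expression, the same one the query for problem #33 uses.  This is only an illustration; how the conversion program actually applies the cleanup is not shown on this page, and the output column names are made up for the example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each animid that the cleanup would change&lt;br /&gt;
-- (strip trailing spaces, then force upper-case)&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;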
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
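&lt;br /&gt;
No query is recorded for this problem.  A sketch of one, using the same NULL-or-empty test as the query for problem #37, might look like the following.  It treats a time period as having &amp;quot;only one observer&amp;quot; when exactly one of its two observer columns is filled in; that reading of the problem is an assumption.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where exactly one of the two observers is recorded&lt;br /&gt;
-- for the AM period, or for the PM period.&lt;br /&gt;
-- Comparing the two booleans with &amp;lt;&amp;gt; is true when only one column is empty.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;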
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
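&lt;br /&gt;
Note that the query above only reports names containing a &amp;quot;/&amp;quot; that also have a variant differing by character case.  A simpler sketch, which lists every follow with a &amp;quot;/&amp;quot; in any observer column, follows.  It is an illustration rather than the query used to get the counts above, and it also matches values such as n/a.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where any observer column contains a &amp;quot;/&amp;quot;&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where strpos(coalesce(fol_am_observer_1, &amp;#039;&amp;#039;), &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
        or strpos(coalesce(fol_am_observer_2, &amp;#039;&amp;#039;), &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
        or strpos(coalesce(fol_pm_observer_1, &amp;#039;&amp;#039;), &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
        or strpos(coalesce(fol_pm_observer_2, &amp;#039;&amp;#039;), &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;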
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
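&lt;br /&gt;
The substitution can be sketched as follows.  This assumes the cycle code is stored as text, as the comparison with &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; in the query for problem #46 suggests.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; output column is a made-up name, and the conversion program may do this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: report MISS wherever the old data has no cycle code&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;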
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
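&lt;br /&gt;
To check whether any non-females with a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; remain, a query along these lines could be run.  It is a sketch under two assumptions: that the code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and that an unknown sex may be stored as NULL, which is why it uses &amp;lt;code&amp;gt;is distinct from&amp;lt;/code&amp;gt; rather than &amp;lt;code&amp;gt;&amp;amp;lt;&amp;amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: non-females (unknown sex included) still carrying a cycle code of 0&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex is distinct from &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;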
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
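&lt;br /&gt;
A minimal sketch of such an update, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with the same join used in the query above (the actual conversion code is not shown on this page and may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a as MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;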
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
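&lt;br /&gt;
The case corrections listed above could be applied along these lines.  This is a sketch that assumes the corrections are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; this page does not record where they were actually made.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known misspellings&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;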
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
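&lt;br /&gt;
The query behind the second count in the Problem section, the one that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, is not recorded here.  A sketch, adapted from the query above by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from both the grouping and the match, might look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As with the query above, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time are grouped together but never matched by the equality test, so they would not be listed.&lt;br /&gt;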
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
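&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column holds variable-length text so that the comparison with &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; detects the trailing spaces as it does in the query above (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal ids that have them&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;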
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE (from an earlier version of the query above, which spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
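&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update mirrors the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the 0 placeholder with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;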
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
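&lt;br /&gt;
One way to express this mapping is with a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  This is a sketch, not the conversion code itself; the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is made up for illustration.  Any value not listed maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values remain visible in the summary.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: preview how each raw flag value would be interpreted&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag = &amp;#039; &amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;YES&amp;#039;, &amp;#039;X&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;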
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG, although there can be at most one row per community per year. Each duplicate has a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 distinct aggression_event.ae_fol_b_focal_id values that do not appear among the follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are among the rows excluded as part of Problem #70, so the conversion process has no separate exclusion clause for these data. As a corollary, addressing all issues in Problem #70 would also resolve every Problem #71 case.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows whose ae_recipient_certainty_flag is something other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows in which one or more of the flags has a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
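&lt;br /&gt;
Pending that confirmation, the two steps above could be sketched as follows, assuming PostgreSQL. The sample values and the &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;raw_value&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;clean_value&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;sokwe_value&amp;lt;/code&amp;gt; names are hypothetical. Treating the single space produced by the first step like the empty string is also an assumption:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical sample values, not rows from the real table.&lt;br /&gt;
WITH sample (raw_value) AS (&lt;br /&gt;
  VALUES (&amp;#039;?&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;Yes&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;X?&amp;#039;), (&amp;#039;&amp;#039;)&lt;br /&gt;
),&lt;br /&gt;
cleaned AS (&lt;br /&gt;
  -- Step 1: clean schema conversions (LIKE is case-sensitive).&lt;br /&gt;
  SELECT raw_value,&lt;br /&gt;
         CASE&lt;br /&gt;
           WHEN raw_value = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
           WHEN raw_value LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           WHEN raw_value LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
           ELSE raw_value&lt;br /&gt;
         END AS clean_value&lt;br /&gt;
    FROM sample&lt;br /&gt;
)&lt;br /&gt;
-- Step 2: sokwe conversions; unmatched values become NULL.&lt;br /&gt;
SELECT raw_value,&lt;br /&gt;
       clean_value,&lt;br /&gt;
       CASE&lt;br /&gt;
         WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN BTRIM(clean_value) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;  -- assumption: space counts as empty&lt;br /&gt;
       END AS sokwe_value&lt;br /&gt;
  FROM cleaned;&amp;lt;/pre&amp;gt;&lt;br /&gt;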
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=583</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=583"/>
		<updated>2026-05-18T22:02:23Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #74 request confirmation&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
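&lt;br /&gt;
One way such a fix could look, assuming PostgreSQL and a raw column of type timestamp, is to cast the value to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;, which discards the date portion. This is a guess at the approach rather than the conversion code itself, and the sample values and the &amp;lt;code&amp;gt;ts&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; names are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical sample values: one with the expected date, one without.&lt;br /&gt;
SELECT ts,&lt;br /&gt;
       ts::TIME AS time_only  -- keeps only the time of day&lt;br /&gt;
  FROM (VALUES (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;),&lt;br /&gt;
               (TIMESTAMP &amp;#039;2004-03-02 07:15:00&amp;#039;)) AS t (ts);&amp;lt;/pre&amp;gt;&lt;br /&gt;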
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
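&lt;br /&gt;
A minimal sketch of the substitution, not the conversion program&amp;#039;s actual code.  It assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and that the database is PostgreSQL:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person; other values pass through.&lt;br /&gt;
select log.date_of_update&lt;br /&gt;
     , log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;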
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
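&lt;br /&gt;
If the arrival data is taken as correct, both here and in problem #24, a follow&amp;#039;s start and end can be derived from the focal&amp;#039;s own arrival rows.  A sketch, reusing the focal-only filter from the query above; the output column names &amp;lt;code&amp;gt;follow_start&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_end&amp;lt;/code&amp;gt; are made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow, with times taken only from follow_arrival.&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , min(fa_time_start) as follow_start&lt;br /&gt;
     , max(fa_time_end) as follow_end&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  group by fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;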
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
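&lt;br /&gt;
A sketch of reading the start-in-nest value from Follow_Arrival alone.  It assumes, as the query above does, that an fa_type_of_nesting of 1 or 3 on the focal&amp;#039;s first arrival means the focal started in the nest.  The column name &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; is illustrative, and &amp;lt;code&amp;gt;DISTINCT ON&amp;lt;/code&amp;gt; is PostgreSQL-specific:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Keep only the focal&amp;#039;s earliest arrival row per follow.&lt;br /&gt;
-- A NULL fa_type_of_nesting yields a NULL begin_in_nest.&lt;br /&gt;
select distinct on (fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
       fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , (fa_type_of_nesting in (1, 3)) as begin_in_nest&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;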
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
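&lt;br /&gt;
A sketch of the cleanup, combining this fix with the trailing-space fix of problem #31.  The column names &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; are illustrative only:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each animid that the cleanup would change, next to its cleaned form.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;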
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
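&lt;br /&gt;
A sketch of the substitution for the morning period, assuming the stand-in person&amp;#039;s code is the literal &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.  Per problem #36, a row is produced only when at least one of the two observer columns has a value; the output column names are illustrative:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Blank or NULL observers become NONE; follows with no AM observers are skipped.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;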
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
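&lt;br /&gt;
A minimal sketch of the second half of this change, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the change is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the choice of schema is an assumption):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give arrivals with no cycle information the MISS code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;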
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks found Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - Brec Swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
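&lt;br /&gt;
A sketch of how the change could be expressed, reusing the join condition from the query above.  This is an illustration, not a record of the statement that was actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;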
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
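&lt;br /&gt;
The two case corrections listed in this solution could be written as follows.  This is a sketch that assumes the corrections are applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; adding the new codes is a separate step that is not shown:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  WHERE fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;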
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
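&lt;br /&gt;
The 430-row figure in the problem description comes from leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  That query is not recorded here; a sketch of it, assuming it mirrors the start and end time check with one column dropped:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; assumed to mirror the query above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;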
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
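&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is a variable-length text type so that trailing spaces are significant in comparisons:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only rows that need it.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;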
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; fails, because the column name in the dump ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
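&lt;br /&gt;
A minimal sketch of the change, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, where the query above finds the rows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;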
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single &amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt; entry to &amp;lt;code&amp;gt;EVL&amp;lt;/code&amp;gt; in Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
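&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the column is a variable-length text type (this is illustrative and is not necessarily the change recorded as 9c114ac):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;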
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian fixed in Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
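&lt;br /&gt;
As a sketch of that mapping (not necessarily the conversion&amp;#039;s actual code), a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression can produce the boolean. The output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and any value other than those listed is assumed to map to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stays visible for review:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag next to its proposed boolean value.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL, the empty string, and runs of spaces all become FALSE.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         ELSE NULL  -- unexpected value, left for review&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
  FROM easy.aggression_event AS ae;&amp;lt;/pre&amp;gt;&lt;br /&gt;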
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
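** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039; &amp;#039;&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039;0&amp;#039;&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;&amp;#039;X&amp;#039;&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;&amp;#039;1&amp;#039;&amp;lt;/code&amp;gt;&lt;/div&gt;</summary>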
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=582</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=582"/>
		<updated>2026-05-18T20:39:37Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: fix Problem #74 formatting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
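&lt;br /&gt;
As an illustration only (the conversion code itself is not shown on this page), one way to drop a stray date part is to cast the value to a time.  The sketch assumes BREC_time is a timestamp and uses made-up sample values rather than the real table:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; the second has a date that is not 1899-12-30.&lt;br /&gt;
select v.brec_time&lt;br /&gt;
     , v.brec_time::time as time_only&lt;br /&gt;
  from (values (timestamp &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
             , (timestamp &amp;#039;2004-03-12 07:15:00&amp;#039;)) as v (brec_time);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;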
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG: changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
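&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;; the actual conversion code may differ.  The sample rows and the &amp;lt;code&amp;gt;ABC&amp;lt;/code&amp;gt; person code are made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
select log.made_by&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from (values (&amp;#039;ABC&amp;#039;), (NULL)) as log (made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;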
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
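&lt;br /&gt;
The cleanup for this problem and for problem #31 can be sketched as a single expression, the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; form used in the query for problem #33.  The sample animid values are made up, and the real conversion code may be written differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids: trailing spaces, lower-case, and already clean.&lt;br /&gt;
select sample.animid as raw_animid&lt;br /&gt;
     , upper(rtrim(sample.animid)) as cleaned_animid&lt;br /&gt;
  from (values (&amp;#039;FD  &amp;#039;), (&amp;#039;fd&amp;#039;), (&amp;#039;FD&amp;#039;)) as sample (animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;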
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
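&lt;br /&gt;
The query above joins against case variants, so it returns a name containing &amp;quot;/&amp;quot; only when that name also differs from another name by character case.  As a hedged alternative, the following sketch lists every trimmed observer value containing a &amp;quot;/&amp;quot;, along with how often it is used.  It assumes only the four observer columns named above; the &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt; alias is illustrative.  Values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; are included in the output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: every observer value containing a slash.&lt;br /&gt;
-- NULL observers drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT people.person, count(*)&lt;br /&gt;
  FROM (SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
       ) AS people&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY people.person&lt;br /&gt;
  ORDER BY people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;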
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  Ignoring case, that is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
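&lt;br /&gt;
The fix itself was made in MS Access.  For reference, a rough SQL equivalent of the cycle code change might look like the sketch below.  It assumes the change would be applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; is unique; neither is confirmed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-females with a cycle code of 0 become n/a.&lt;br /&gt;
-- A NULL b_sex does not satisfy &amp;lt;&amp;gt; and would need its own test.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;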
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
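&lt;br /&gt;
A minimal sketch of that change, reusing the join from the query above.  It assumes &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; is unique and is not necessarily the statement the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a become MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;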
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
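&lt;br /&gt;
The two spelling corrections could be sketched as follows.  This assumes they are applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; adding the new codes to whatever table holds the valid data sources is not shown, because that table is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;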
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
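&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not listed.  A sketch along the same lines, using a join rather than &amp;lt;code&amp;gt;exists&amp;lt;/code&amp;gt;, might be the following.  Rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; never match on &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;, so they are not reported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;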
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
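&lt;br /&gt;
A minimal sketch of the cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that only the focal id column needs trimming:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;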
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The column name ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt;. A version of the query that used the &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt; spelling failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force but adequate solution: it does not validate anything about the affected rows.&lt;br /&gt;
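&lt;br /&gt;
A minimal sketch of the change, using the table and column names from the query above (whether the conversion applies it as an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; like this one is an assumption):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumed form of the fix: replace 0 with NULL, leave other values alone.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&lt;br /&gt;
-- Afterwards the &amp;quot;Bad Data&amp;quot; query should return no rows.&lt;br /&gt;
select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;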
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it does not appear in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
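&lt;br /&gt;
The mapping can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is a minimal, self-contained illustration, not the conversion code itself: the &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; rows and the &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; names are made up, and trimming and upper-casing the value before comparing is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Made-up sample values standing in for ae_bad_observation_flag.&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN raw_flag IS NULL OR btrim(raw_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN upper(btrim(raw_flag)) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- Any other value yields NULL, so it stays visible for review.&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
  FROM (VALUES (&amp;#039;X&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039; &amp;#039;), (NULL)) AS sample(raw_flag);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because the comparison upper-cases the value, the same expression also accepts a lower-case &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;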
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have a ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
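* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to the empty string&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;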
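&lt;br /&gt;
The clean schema conversions listed above could be sketched as follows. This is an illustration under assumptions, not the conversion code: it reads the &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; entries as SQL &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; patterns and the first target as the empty string, and the &amp;lt;code&amp;gt;sample&amp;lt;/code&amp;gt; rows and the &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;clean_flag&amp;lt;/code&amp;gt; names are made up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Made-up sample values standing in for one of the flag columns.&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN raw_flag = &amp;#039;?&amp;#039; THEN &amp;#039;&amp;#039;&lt;br /&gt;
         WHEN raw_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
         WHEN raw_flag LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
         -- Values matching no rule (including NULL) pass through unchanged.&lt;br /&gt;
         ELSE raw_flag&lt;br /&gt;
       END AS clean_flag&lt;br /&gt;
  FROM (VALUES (&amp;#039;?&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;X &amp;#039;), (&amp;#039;X&amp;#039;), (NULL)) AS sample(raw_flag);&amp;lt;/pre&amp;gt;&lt;br /&gt;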
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** the empty string to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=581</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=581"/>
		<updated>2026-05-18T20:37:51Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #74 there are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.B_DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.B_DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
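&lt;br /&gt;
One way the conversion could discard the stray date portion is a cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;.  This is an assumption, offered only as a sketch; the actual conversion code may work differently, and the sample timestamps are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; the real data is in raw.&amp;quot;BRECORD_NOTES&amp;quot;.&lt;br /&gt;
-- The cast keeps the time of day and drops whatever date was stored.&lt;br /&gt;
select brec_time as original_value&lt;br /&gt;
     , brec_time::time as converted_time&lt;br /&gt;
  from (values (timestamp &amp;#039;1899-12-30 08:15:00&amp;#039;)&lt;br /&gt;
             , (timestamp &amp;#039;2004-03-02 08:15:00&amp;#039;)) as sample (brec_time);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;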
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
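&lt;br /&gt;
A minimal sketch of the NULL-to-UNK substitution, assuming it can be written with &amp;lt;code&amp;gt;coalesce()&amp;lt;/code&amp;gt;.  The sample rows and the &amp;lt;code&amp;gt;XYZ&amp;lt;/code&amp;gt; person code are hypothetical, and the committed conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample data; the real rows are in&lt;br /&gt;
-- clean.community_membership_update_log.&lt;br /&gt;
-- A NULL made_by is replaced with the unknown person, UNK.&lt;br /&gt;
select made_by as original_made_by&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from (values (&amp;#039;XYZ&amp;#039;), (null)) as sample (made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;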
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; column is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; column is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
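&lt;br /&gt;
The cleanup for problems #31 and #32 can be sketched together, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  This is an illustration only: the sample values are hypothetical, and the conversion process may apply the two steps differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; the real data is in easy.follow.&lt;br /&gt;
-- rtrim() removes trailing spaces, upper() forces upper-case.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from (values (&amp;#039;ABC  &amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;ABC&amp;#039;)) as sample (fol_b_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;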
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
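&lt;br /&gt;
The query above reports a name only when another spelling of it differs by character case, which the case normalization of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema may leave empty.  A simpler sketch follows, assuming the intent is to list every distinct observer value that contains a &amp;quot;/&amp;quot; character.  The alias &amp;lt;code&amp;gt;obs&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Stack the four observer columns into one column, then keep values with a &amp;#039;/&amp;#039;&lt;br /&gt;
-- (NULL observers drop out because STRPOS(NULL, ...) is NULL)&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.person) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
                                , (follow.fol_am_observer_2)&lt;br /&gt;
                                , (follow.fol_pm_observer_1)&lt;br /&gt;
                                , (follow.fol_pm_observer_2)) AS obs (person)&lt;br /&gt;
  WHERE STRPOS(obs.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;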
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
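&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied with &amp;lt;code&amp;gt;coalesce()&amp;lt;/code&amp;gt; when the arrivals are read.  The alias &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; is hypothetical, and the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview the arrivals that would receive the MISS code&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;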
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
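&lt;br /&gt;
To see the cutoff birthdate that the query computes, the interval arithmetic can be run on its own.  The follow date below is made up for illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- For a follow on 2020-06-15, females born on or before the cutoff are reported,&lt;br /&gt;
-- that is, those who have reached their 15th birthday&lt;br /&gt;
select &amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
-- cutoff: 2005-06-15 00:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;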
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
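&lt;br /&gt;
A minimal sketch of the change, assuming it is made with an in-place &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual cleanup may be implemented differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Recode n/a to MISS, but only for arriving individuals recorded as female&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;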
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
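&lt;br /&gt;
The case corrections for &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt; can be previewed with a query like the following.  This is a sketch only: the aliases &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized&amp;lt;/code&amp;gt; are hypothetical, and the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Each stored data source, the value it would be normalized to, and its row count&lt;br /&gt;
select follow_arrival.fa_data_source as original&lt;br /&gt;
     , case lower(follow_arrival.fa_data_source)&lt;br /&gt;
         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when &amp;#039;tiki&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as normalized&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;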
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
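&lt;br /&gt;
No query is given above for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  One possible sketch uses a window count instead of a join; the aliases &amp;lt;code&amp;gt;counted&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dup_count&amp;lt;/code&amp;gt; are hypothetical.  Note that &amp;lt;code&amp;gt;partition by&amp;lt;/code&amp;gt; treats NULL values as equal, while the &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; comparisons in the query above never match NULLs, so the row count may differ from the figure quoted in the problem description.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
select *&lt;br /&gt;
  from (select follow_arrival.*&lt;br /&gt;
             , count(*) over (partition by fa_fol_date&lt;br /&gt;
                                         , fa_fol_b_focal_animid&lt;br /&gt;
                                         , fa_b_arr_animid&lt;br /&gt;
                                         , fa_time_start) as dup_count&lt;br /&gt;
          from clean.follow_arrival) as counted&lt;br /&gt;
  where counted.dup_count &amp;gt; 1&lt;br /&gt;
  order by counted.fa_fol_date&lt;br /&gt;
         , counted.fa_fol_b_focal_animid&lt;br /&gt;
         , counted.fa_b_arr_animid&lt;br /&gt;
         , counted.fa_time_start&lt;br /&gt;
         , counted.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;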
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
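&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that would change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;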
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
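&lt;br /&gt;
Whether deleting is safe depends on whether the duplicated rows are identical in every column, not just in the key columns. A minimal sketch of one way to check, assuming PostgreSQL (where a whole row can be cast to &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; and grouped on); the &amp;lt;code&amp;gt;whole_row&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;copies&amp;lt;/code&amp;gt; names are only illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: list rows that are exact copies of another row.&lt;br /&gt;
SELECT gb::text AS whole_row, count(*) AS copies&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY gb::text&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY copies DESC;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows reported here are exact copies. Duplicate keys that are not reported differ in at least one non-key column and would need a decision about which row to keep.&lt;br /&gt;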
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
The query above spells the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;. An earlier version, which spelled it &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
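&lt;br /&gt;
A minimal sketch of the change, assuming it is made with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion may instead apply it while copying the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the placeholder 0 with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;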
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
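&lt;br /&gt;
A minimal sketch of how this mapping could be previewed before it is applied, assuming a plain &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression; the &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;row_count&amp;lt;/code&amp;gt; names are only illustrative and the conversion itself may be written differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each raw flag value next to its boolean mapping.&lt;br /&gt;
-- There is no ELSE branch, so unexpected values map to NULL and stand out.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;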
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are among the rows excluded as part of Problem #70. As such, there is no separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of offending values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039;&amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=580</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=580"/>
		<updated>2026-05-17T22:17:12Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #73 there are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
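&lt;br /&gt;
Before a change like this, it can help to confirm that every value fits the expected pattern. The following is a minimal sketch, not the procedure actually used: it assumes the pre-fix data is still loaded in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema and that a well-formed value is &amp;quot;CH&amp;quot; followed only by digits. Any row it returns would need manual attention before the prefix is stripped.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: a well-formed value is CH followed by one or more digits.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and &amp;quot;B_AnimID_num&amp;quot; !~ &amp;#039;^CH[0-9]+$&amp;#039;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;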
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
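&lt;br /&gt;
The page does not say how the conversion repairs these rows. One possibility, shown only as a sketch, is to keep the time portion and discard the date. It assumes &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is stored as a timestamp in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema; the &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: BREC_time is a timestamp, so the cast keeps only the time of day.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;, &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;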
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
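&lt;br /&gt;
The substitution described in this solution could be sketched as follows. This is an illustration only, not the code from the commit: it assumes &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column and that the unknown person is stored under the literal code &amp;#039;UNK&amp;#039;. The &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the unknown person has the code UNK.&lt;br /&gt;
SELECT date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by_or_unk&lt;br /&gt;
  FROM clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;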
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the value from follow_arrival.&lt;br /&gt;
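&lt;br /&gt;
As an illustration only, the end-in-nest value could be derived from follow_arrival alone, along the lines of the query above.  This sketch assumes, as that query does, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 2 and 3 mean the individual is in a nest.  The &amp;lt;code&amp;gt;ends_in_nest&amp;lt;/code&amp;gt; column name is made up for the example, and restricting the last arrival to the focal&amp;#039;s own rows is an assumption, not necessarily what the conversion does.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per follow, flagged by the focal&amp;#039;s last arrival&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , BOOL_OR(last_arrivals.fa_type_of_nesting IN (2, 3)) AS ends_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Assumption: only the focal&amp;#039;s own arrival rows count&lt;br /&gt;
          AND last_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  GROUP BY spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;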
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
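&lt;br /&gt;
A minimal sketch of this cleanup, combined with the trailing-space removal of problem #31.  It assumes the cleanup is applied as an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion code is not shown on this page and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize animids; touches only rows that need changing&lt;br /&gt;
update clean.follow&lt;br /&gt;
  set fol_b_animid = upper(rtrim(fol_b_animid))&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;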
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
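&lt;br /&gt;
No query is recorded for this problem.  A sketch that should list the affected follows, assuming &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period (AM or PM) is blank after trimming:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: comparing two booleans with &amp;lt;&amp;gt; is true when exactly one is true&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;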
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - Brec Swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
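&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; cycle code is acceptable in &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; (see problem #42); the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: females with a cycle code of n/a get MISS instead&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;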
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
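&lt;br /&gt;
The case corrections in this solution can be sketched as a single update.  This assumes the fix is applied in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and covers only the three misspellings named above; adding the other codes to the list of valid values is a separate step not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: fix the character case of the known misspelled codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;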
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
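&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, for which no query is shown. A sketch, assuming it follows the same pattern as the queries above. This version lists each duplicated key once, with a count (the &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; column name is made up for illustration), so the number of rows it returns need not match the 430 quoted in the Problem section:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;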
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
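&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is done with a direct update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion step may work differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix is applied directly to the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;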
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE (when the column is spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; in the query): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything concerning the affected rows.&lt;br /&gt;
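&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a direct update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: no validation of the affected rows is attempted.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;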
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
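&lt;br /&gt;
A sketch of how this mapping might be previewed before it is applied, extending the summary query above. The &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for illustration, and treating any all-space value (not only a single space) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt; is an assumption:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: previews the proposed mapping, one row per distinct raw value.&lt;br /&gt;
SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;) AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         -- Assumption: values consisting only of spaces are treated alike.&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         ELSE NULL  -- any other value is left unmapped, for review&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;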
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=579</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=579"/>
		<updated>2026-05-17T20:39:13Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #72 There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required).&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
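&lt;br /&gt;
A view of this kind might look roughly like the following. This is a sketch, not the actual SokweDB definition: the lower-case table and column names are assumed, &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt; is assumed to be a boolean, and the view&amp;#039;s other columns are omitted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; all names here are assumptions.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case when dadidprelim&lt;br /&gt;
              then dadid || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
              else dadid&lt;br /&gt;
         end as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;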
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
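&lt;br /&gt;
One way such a rule could be enforced in PostgreSQL is with an exclusion constraint. This is only a sketch: the constraint name is made up, and it is not known whether SokweDB does this. It assumes that &amp;lt;code&amp;gt;clean.community_membership&amp;lt;/code&amp;gt; is a table, that the &amp;lt;code&amp;gt;btree_gist&amp;lt;/code&amp;gt; extension is available, that the date columns are of type &amp;lt;code&amp;gt;date&amp;lt;/code&amp;gt;, and that both the start and end dates are inclusive.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; the constraint name is hypothetical.&lt;br /&gt;
create extension if not exists btree_gist;&lt;br /&gt;
&lt;br /&gt;
-- Reject any two rows for the same individual whose date ranges overlap.&lt;br /&gt;
alter table clean.community_membership&lt;br /&gt;
  add constraint no_overlapping_membership&lt;br /&gt;
  exclude using gist&lt;br /&gt;
    (cm_b_animid with =&lt;br /&gt;
     , (daterange(cm_start_date, cm_end_date, &amp;#039;[]&amp;#039;)) with &amp;amp;&amp;amp;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;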
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
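&lt;br /&gt;
As a rough illustration of the substitution described above (not the code from the commit), the replacement could be sketched as follows. The table and column names are taken from the query in this section; treating &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; as a text column is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each log row with a NULL made_by shown as &amp;#039;UNK&amp;#039;.&lt;br /&gt;
select coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
     , log.*&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;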
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
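&lt;br /&gt;
A minimal sketch of a query listing the distinct animids involved, in the style of the second query of problem #33.  It assumes only the &amp;lt;code&amp;gt;easy.follow&amp;lt;/code&amp;gt; column used above; the &amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt; alias is illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The distinct animids involved (sketch)&lt;br /&gt;
select distinct rtrim(fol_b_animid) as animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;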
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
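&lt;br /&gt;
A sketch of a query that might list these follows.  It assumes the same blank test (NULL or empty after trimming) used in problem #37, and treats a time period as having one observer when exactly one of its two observer columns is blank.  It is an illustration, not part of the conversion process.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where the AM or the PM period has exactly one observer (sketch)&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;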
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- Keep only the names that contain a &amp;quot;/&amp;quot; character&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  In terms of unique names, that is about half as many.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
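&lt;br /&gt;
The change might be expressed as something like the following sketch.  It assumes the change is made with a plain UPDATE against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; column can hold the value &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;; the actual conversion code is not shown on this page and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mark the n/a cycle codes of females as missing&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;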
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
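&lt;br /&gt;
The start-time-only check described in the Problem section has no query listed here. A minimal sketch of that check, modeled on the queries above, follows. The &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; column name is made up for illustration, and the query lists the duplicated key combinations rather than the full rows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (end time ignored)&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows  -- hypothetical column name&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;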
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
==(#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
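&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no other table has to be updated to match. This is an illustration, not necessarily the statement used in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The where clause limits the update to the rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;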
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
An earlier version of the query above spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (lowercase &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
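&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. This is an illustration, not necessarily the statement used in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No check is made of which rows are affected.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;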
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
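&lt;br /&gt;
A minimal sketch of the substitution step, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been created in ARRIVAL_SOURCES. The column names of that table are not given here, so the statement that creates the row is not shown. This is an illustration, not the code of the commit cited above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: use the none code wherever the data source is missing.&lt;br /&gt;
-- Assumes none already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;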
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
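&lt;br /&gt;
The mapping can be previewed before it is applied. The query below is a minimal sketch. The &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for illustration. Any value the rule does not name maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values stand out in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each raw flag value next to its proposed boolean&lt;br /&gt;
select ae.ae_bad_observation_flag&lt;br /&gt;
     , case&lt;br /&gt;
         when ae.ae_bad_observation_flag is null&lt;br /&gt;
              or btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; then false&lt;br /&gt;
         when btrim(ae.ae_bad_observation_flag) in (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) then true&lt;br /&gt;
       end as bad_observation  -- hypothetical column name&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.aggression_event as ae&lt;br /&gt;
  group by ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;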
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per-community, per-year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where &amp;lt;code&amp;gt;clean.aggression_event&amp;lt;/code&amp;gt; does not have a matching row in &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt; for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=578</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=578"/>
		<updated>2026-05-17T20:27:22Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: minor update to formatting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
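&lt;br /&gt;
A minimal sketch of how such a view could derive DadID, assuming DadIDPrelim is a boolean flag.  The lower-case column names and the sample rows are hypothetical and exist only so the query runs on its own.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for BIOGRAPHY_DATA.&lt;br /&gt;
-- Assumes dadidprelim is a boolean; the real design may differ.&lt;br /&gt;
WITH biography_data (animid, dad_id, dadidprelim) AS&lt;br /&gt;
  (VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;XXX&amp;#039;, FALSE)&lt;br /&gt;
        , (&amp;#039;BBB&amp;#039;, &amp;#039;YYY&amp;#039;, TRUE)&lt;br /&gt;
        , (&amp;#039;CCC&amp;#039;, NULL, FALSE))&lt;br /&gt;
SELECT animid&lt;br /&gt;
     , CASE WHEN dadidprelim THEN dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
            ELSE dad_id&lt;br /&gt;
       END AS dadid&lt;br /&gt;
  FROM biography_data&lt;br /&gt;
  ORDER BY animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;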
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
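&lt;br /&gt;
A minimal sketch of the substitution, using &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  The sample rows and the PERSON1 code are made up so that the query runs on its own; it is assumed that a UNK row exists among the people codes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for the update log.&lt;br /&gt;
WITH log (chimp_id, made_by) AS&lt;br /&gt;
  (VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;PERSON1&amp;#039;)&lt;br /&gt;
        , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by  -- NULL becomes the unknown person&lt;br /&gt;
  FROM log&lt;br /&gt;
  ORDER BY chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;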
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the first arrival/last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
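&lt;br /&gt;
With fol_time_begin and fol_time_end set aside (problems #24 and #25), the span of a follow can be taken from the FOLLOW_ARRIVAL rows of the focal individual.  A minimal sketch, reusing the grouping from the queries above; the sample rows are made up so that it runs on its own.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for clean.follow_arrival.&lt;br /&gt;
WITH follow_arrival (fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
                   , fa_b_arr_animid, fa_time_start, fa_time_end) AS&lt;br /&gt;
  (VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;AAA&amp;#039;, TIME &amp;#039;06:30&amp;#039;, TIME &amp;#039;09:00&amp;#039;)&lt;br /&gt;
        , (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;BBB&amp;#039;, TIME &amp;#039;06:00&amp;#039;, TIME &amp;#039;19:00&amp;#039;)&lt;br /&gt;
        , (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;AAA&amp;#039;, TIME &amp;#039;10:15&amp;#039;, TIME &amp;#039;18:45&amp;#039;))&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS follow_begin&lt;br /&gt;
     , MAX(fa_time_end) AS follow_end&lt;br /&gt;
  FROM follow_arrival&lt;br /&gt;
  -- Only the arrivals of the focal itself define the span of the follow.&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;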
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
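&lt;br /&gt;
A minimal sketch of taking the begin-in-nest flag from FOLLOW_ARRIVAL, reading fa_type_of_nesting the way the query above does (1 or 3 meaning the focal started in a nest).  The sample rows, including the nesting code 0, are made up so that the query runs on its own.  Unlike the query above, the join here is restricted to the arrival rows of the focal itself, which is an assumption about the intent.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for easy.follow_arrival.&lt;br /&gt;
WITH follow_arrival (fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
                   , fa_b_arr_animid, fa_time_start&lt;br /&gt;
                   , fa_type_of_nesting) AS&lt;br /&gt;
  (VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;AAA&amp;#039;, TIME &amp;#039;06:30&amp;#039;, 1)&lt;br /&gt;
        , (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;AAA&amp;#039;, TIME &amp;#039;12:00&amp;#039;, 0)&lt;br /&gt;
        , (DATE &amp;#039;2000-01-02&amp;#039;, &amp;#039;BBB&amp;#039;, &amp;#039;BBB&amp;#039;, TIME &amp;#039;07:10&amp;#039;, 0))&lt;br /&gt;
   , spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     -- True when the first arrival of the focal is coded as in-nest.&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting IN (1, 3) AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;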
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
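&lt;br /&gt;
A minimal sketch of the combined cleanup from problems #31 and #32, assuming it amounts to applying &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid (the same expression used in the problem #33 query).  The &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized&amp;lt;/code&amp;gt; column aliases are illustrative only; this is not the actual conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each distinct animid that the cleanup would change (illustrative only)&lt;br /&gt;
select distinct fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;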
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
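&lt;br /&gt;
The row-creation rule described above can be sketched as a query.  This is an illustration under stated assumptions, not the actual conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column aliases are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow and time period; a period is skipped when both&lt;br /&gt;
-- of its observer columns are NULL or blank.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed, not taken from PERIODS.&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;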
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
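&lt;br /&gt;
No query is recorded for this problem.  A possible one is sketched here, assuming &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period is blank; it reuses the blank test from the problem #37 query and has not been checked against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where a time period has exactly one of its two observers recorded&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;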
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
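&lt;br /&gt;
To make the &amp;quot;at least 15 years of age&amp;quot; remark concrete, here is a worked example of the cutoff arithmetic used in the query above.  The follow date is made up for illustration and does not come from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative date only: for a follow on 2000-06-15 the cutoff birthdate&lt;br /&gt;
-- is 1985-06-15, so the query flags females who are 15 or more years old&lt;br /&gt;
-- on the day of the follow.&lt;br /&gt;
select (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
        - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
        - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
       )::date as cutoff_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;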
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
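&lt;br /&gt;
The two renames above can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  This only previews the mapping against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema; the &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized&amp;lt;/code&amp;gt; aliases are illustrative, and how the conversion actually applies the change is not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview of the case normalization, with row counts per original value&lt;br /&gt;
select fa_data_source as original&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as normalized&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;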
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
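&lt;br /&gt;
A minimal sketch of one way to do this, assuming the fix is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the statement actually used in the conversion may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal id; only rows that need it are touched.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;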
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
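ERROR MESSAGE (raised when the query refers to &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; instead of &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;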
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is brute-force, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
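&lt;br /&gt;
A sketch of the change, assuming it is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace the 0 values with NULL; no other column is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;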
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
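&lt;br /&gt;
A sketch of the second half of this fix, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not given on this page, so the insert is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumes the value &amp;#039;none&amp;#039; already exists in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;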
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
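&lt;br /&gt;
A sketch of how this mapping could be written, assuming the flag is stored as text. The &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column name is made up for the example, and any value not covered by the rule above is deliberately left &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL, the empty string, and all-space values count as FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         -- surrounding spaces are ignored when matching X and YES&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         ELSE NULL  -- unexpected value&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;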
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
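&lt;br /&gt;
To check the number of duplicated pairs, the grouping subquery above can be counted on its own. This is a sketch for verification only, not part of the conversion; the &amp;lt;code&amp;gt;duplicated_pairs&amp;lt;/code&amp;gt; column name is made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Number of year*community pairs that appear more than once&lt;br /&gt;
SELECT count(*) AS duplicated_pairs&lt;br /&gt;
FROM (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;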
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
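&lt;br /&gt;
The overlap can be checked with a sketch like the following, which splits the Problem #70 rows by whether the focal ID appears in clean.follow on any date. The rows counted as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt; should be the ones this problem describes; the output column names are made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Rows with a NULL focal id, if any, also land in the TRUE group.&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.focal_id_missing_from_follow,&lt;br /&gt;
    COUNT(*) AS row_count,&lt;br /&gt;
    COUNT(DISTINCT pr.ae_fol_b_focal_id) AS focal_ids&lt;br /&gt;
FROM (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.ae_fol_b_focal_id,&lt;br /&gt;
        NOT EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_missing_from_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    -- Same exclusion test as Problem #70&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
) AS pr&lt;br /&gt;
GROUP BY pr.focal_id_missing_from_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;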
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=577</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=577"/>
		<updated>2026-05-17T18:41:13Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: fix Problem #71 Wiki formatting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
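&lt;br /&gt;
The same transformation can be sketched in PostgreSQL.  This is an illustration only, since the actual fix was made in MS Access.  It assumes the column is still TEXT, as it was when the problem was found, and that every value to convert is &amp;quot;CH&amp;quot; followed by digits; the &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip the leading &amp;quot;CH&amp;quot; and cast the rest to an integer.&lt;br /&gt;
-- Rows that do not match the pattern (NULL, empty, other text) are skipped.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , substring(&amp;quot;B_AnimID_num&amp;quot; from 3)::integer as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; ~ &amp;#039;^CH[0-9]+$&amp;#039;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;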
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
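&lt;br /&gt;
One way a conversion could drop the stray date part is to cast the column to a time.  This is a sketch only: it assumes that just the time of day is meaningful, and the actual conversion code may do something different.  The &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and discard the date part.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME WITHOUT TIME ZONE as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;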
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
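&lt;br /&gt;
The substitution can be sketched as follows.  This is an illustration rather than the code in the commit; it assumes the unknown person&amp;#039;s code is the literal &amp;#039;UNK&amp;#039; and uses the column names from the queries on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report the unknown person wherever made_by is NULL.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;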
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
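&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup can be sketched as the query below.  This is an illustration, not the conversion code; the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression is the one used in the query for problem #33, and the column aliases are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each affected animid before and after cleanup.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;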
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
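&lt;br /&gt;
The query above reports a &amp;quot;/&amp;quot; value only when some other value differs from it solely by character case.  A minimal sketch, assuming the goal is to list every observer value containing &amp;quot;/&amp;quot; together with how often it is used; the &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; names are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: every observer value containing a &amp;quot;/&amp;quot;.&lt;br /&gt;
-- Note that n/a also matches.&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
  SELECT observers.person, COUNT(*) AS uses&lt;br /&gt;
    FROM observers&lt;br /&gt;
    WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY observers.person&lt;br /&gt;
    ORDER BY LOWER(observers.person), observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;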
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
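&lt;br /&gt;
A minimal sketch of such a change, assuming it is applied as a direct update to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the statement the conversion actually runs may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;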
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
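&lt;br /&gt;
A minimal sketch of the two case corrections, assuming they are applied as direct updates to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the conversion itself may do this differently, and adding the other codes is not shown:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of Brec and Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;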
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
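&lt;br /&gt;
No query is listed for the start-time-only check described in the problem statement.  A sketch, assuming it mirrors the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out of both the grouping and the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;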
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
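&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is applied as a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion process itself may do this differently, and any foreign keys on the column would need the referenced values to exist):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- Assumes the target table is clean.follow_arrival.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;WHERE&amp;lt;/code&amp;gt; clause restricts the update to the rows that the query under &amp;quot;Bad data&amp;quot; reports, so re-running that query afterwards should return no rows.&lt;br /&gt;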
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Quoted column names are case-sensitive. Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) produces this error message: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
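&lt;br /&gt;
As a sketch, assuming the change is made with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the conversion process may implement it differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the invalid 0 animal ID numbers with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined or changed.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;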
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
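&lt;br /&gt;
The mapping can be previewed with a query along these lines. This is a sketch, not the conversion code: the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; alias is hypothetical, and any value not covered by the rule is deliberately left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show how each existing flag value would be mapped.&lt;br /&gt;
-- Values other than NULL, spaces, X, and YES map to NULL.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;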
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are aggression events that do not have a matching follow for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 rows in clean.aggression_event that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are aggression animal ids that are not reflected among focal animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) ae_fol_b_focal_id values that are not reflected among the fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address these issues as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=576</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=576"/>
		<updated>2026-05-17T18:39:45Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Problem #71: There are aggression animal ids that are not reflected in among focal animal ids.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
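&lt;br /&gt;
A minimal sketch of the substitution, assuming it is done with a plain &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  The actual conversion code in the commit above is not reproduced here and may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: report UNK wherever made_by is NULL,&lt;br /&gt;
-- and count the rows attributed to each person.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
     , count(*) as row_count&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  group by coalesce(made_by, &amp;#039;UNK&amp;#039;)&lt;br /&gt;
  order by made_by;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;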
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
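&lt;br /&gt;
A sketch of what &amp;quot;defaulting to Follow_Arrival&amp;quot; could look like, assuming (as the query above implies) that fa_type_of_nesting values 1 and 3 mean the focal started in the nest.  The output column name begin_in_nest is hypothetical, and this is not the conversion&amp;#039;s actual code.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Derive a begin-in-nest flag from the focal&amp;#039;s first arrival only,&lt;br /&gt;
-- ignoring follow.fol_flag_begin_in_nest entirely.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
       -- A NULL fa_type_of_nesting falls through to 0 here.&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Assumption: restrict to the focal&amp;#039;s own arrival row.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;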
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
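&lt;br /&gt;
A minimal sketch of the combined cleanup for problems #31 and #32, assuming the conversion applies &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid.  The column aliases are hypothetical, and the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each affected follow with its original and cleaned animid.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;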
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
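&lt;br /&gt;
A sketch of the substitution for the morning observers, following the column mapping described in problem #36.  The aliases obs_brec and obs_tiki are hypothetical, the afternoon columns would be handled the same way, and the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Use NONE when one of the two observers is NULL or blank.&lt;br /&gt;
-- Rows where both are blank are skipped, as described in problem #36.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;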
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
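&lt;br /&gt;
The query above only reports a value containing &amp;quot;/&amp;quot; when some other value differs from it by character case alone.  As a minimal sketch, assuming the aim is to list every distinct observer value that contains a &amp;quot;/&amp;quot; (the &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; alias and the &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; column are names invented here), a simpler query might look like the following.  Note that it also matches &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each observer value containing a slash, and how often it is used.&lt;br /&gt;
SELECT observers.person, COUNT(*) AS uses&lt;br /&gt;
  FROM (SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
       ) AS observers&lt;br /&gt;
  -- STRPOS() returns NULL for NULL input, so NULL observers drop out here.&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY observers.person&lt;br /&gt;
  ORDER BY observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;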
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
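&lt;br /&gt;
As a rough sketch of the recoding to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; described above, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with the same join used in the bad data query, the update might be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for arriving females.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;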
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
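&lt;br /&gt;
The two case corrections above could be expressed as updates along these lines.  This is only a sketch, and it assumes the corrections are applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of Brec and Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;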
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
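&lt;br /&gt;
The start-time-only check mentioned in the problem description has no query listed.  A sketch, assuming it follows the same pattern as the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped (whether it reproduces the 430 figure depends on how that count was taken):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;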
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
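&lt;br /&gt;
A minimal sketch of that cleanup, assuming the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that only the focal id column is affected:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces, touching just the rows that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;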
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Quoted column names are case-sensitive, so a query that spells the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt; fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything about the affected rows.&lt;br /&gt;
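&lt;br /&gt;
As a sketch, assuming the change is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the conversion itself may apply it elsewhere):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace the placeholder 0 with NULL; no other columns are examined.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;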
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in commit 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in commit 8c05232.&lt;br /&gt;
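&lt;br /&gt;
A sketch of the replacement step only, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row already exists in ARRIVAL_SOURCES and that fa_data_source stores that code as text; the referenced commit may implement this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Use the explicit &amp;#039;none&amp;#039; code wherever no data source was recorded.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;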
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
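&lt;br /&gt;
One way to preview this mapping, as a sketch against &amp;lt;code&amp;gt;easy.aggression_event&amp;lt;/code&amp;gt;. The &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; name is made up for illustration, and any value not listed maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that unexpected values stand out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL and blank values mean the flag is not set.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         -- X and YES both mean the flag is set.&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;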
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are aggression events that do not have a matching follow for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are aggression animal ids that are not reflected among focal animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) ae_fol_b_focal_id records that are not reflected among fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address these issues as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=575</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=575"/>
		<updated>2026-05-17T18:30:18Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: adding Problem #70: aggression events that do not have a matching follow for the same focal ID and date.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
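&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; column name is made up for illustration, and the conversion program may apply the substitution differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show the affected rows with the NULL made_by&lt;br /&gt;
-- replaced by the (assumed) code of the unknown person.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;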
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
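&lt;br /&gt;
As a rough sketch only, the cleanup for this problem and problem #31 can be previewed together in the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is made up for illustration, and the conversion program may normalize the value differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each animid that cleanup would change,&lt;br /&gt;
-- next to its trimmed, upper-cased form.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;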
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
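&lt;br /&gt;
A sketch of a query that could list the affected follows.  It assumes, as the query for problem #37 does, that a NULL or blank observer column means no observer was recorded, and it has not been checked against the data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: follows where, in the AM or the PM period,&lt;br /&gt;
-- exactly one of the two observer columns is filled in.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
           &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;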
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value in &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and use it for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
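&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with a single statement that reuses the conditions of the query above (the statement actually used by the conversion is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix: females with an n/a cycle code become MISS.&lt;br /&gt;
-- The MISS code must already exist wherever cycle codes are validated.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;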
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
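&lt;br /&gt;
The case corrections above can be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema copy of the table (this page says only that the data has been cleaned there; the statements actually used are not recorded):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix: normalize the character case of the codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;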
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
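&lt;br /&gt;
The 430-row figure comes from the same check with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left off.  A sketch of that query, assuming it mirrors the one above (the exact query used to obtain the count is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form; mirrors the query above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;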
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
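&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion step may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix: strip trailing spaces from the focal id,&lt;br /&gt;
-- touching only the rows that actually have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;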
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
Note: The original query spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (with a lowercase final &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything about the affected rows.&lt;br /&gt;
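&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as a direct update to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion program may instead make the change while loading the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: run directly against the clean schema.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;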
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
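&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single &amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt; entry to &amp;lt;code&amp;gt;EVL&amp;lt;/code&amp;gt; in MS Access, 3/25/2026.&lt;br /&gt;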
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
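&lt;br /&gt;
One way to express this, as a sketch that assumes the fix is a direct update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; it may not match how 9c114ac does it:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: the clean table is updated in place.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;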
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian fixed in MS Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
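&lt;br /&gt;
The mapping can be previewed with a query like the following. This is only a sketch: the &amp;lt;code&amp;gt;old_value&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_value&amp;lt;/code&amp;gt; column names are made up for illustration, and the conversion program may implement the rule differently. Any value not covered by the rule maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values stay visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: old_value and new_value are illustrative names only.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag AS old_value&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR ae.ae_bad_observation_flag = &amp;#039; &amp;#039; THEN FALSE&lt;br /&gt;
       END AS new_value&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;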
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are aggression events that do not have a matching follow for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 rows in &amp;lt;code&amp;gt;clean.aggression_event&amp;lt;/code&amp;gt; that have no matching row in &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt; for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=574</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=574"/>
		<updated>2026-05-16T01:15:47Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: add problem #69 There are duplicate year*community records in AGGRESSION_EVENT_LOG&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, other than their birth community, before the date on which BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access, 2/1/2024, ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
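&lt;br /&gt;
The substitution can be sketched as follows.  This is only an illustration, not the conversion program&amp;#039;s code: the inline rows and the &amp;lt;code&amp;gt;ABC&amp;lt;/code&amp;gt; person code are made up, and &amp;lt;code&amp;gt;converted_made_by&amp;lt;/code&amp;gt; is a hypothetical column name.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace a NULL made_by with the unknown person, UNK.&lt;br /&gt;
-- The rows in VALUES are invented sample data, not project data.&lt;br /&gt;
select made_by, coalesce(made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from (values (&amp;#039;ABC&amp;#039;), (null)) as sample(made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;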
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
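&lt;br /&gt;
A minimal sketch of the cleanup, assuming it amounts to applying &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; to the animid.  The inline values are invented, and &amp;lt;code&amp;gt;raw_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; are hypothetical names used only for illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: strip trailing spaces so that the animid can match BIOGRAPHY.&lt;br /&gt;
-- The rows in VALUES are invented sample data, not project data.&lt;br /&gt;
select raw_animid&lt;br /&gt;
     , rtrim(raw_animid) as cleaned_animid&lt;br /&gt;
     , length(raw_animid) - length(rtrim(raw_animid)) as spaces_removed&lt;br /&gt;
  from (values (&amp;#039;XX  &amp;#039;), (&amp;#039;XX&amp;#039;)) as sample(raw_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;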
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
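&lt;br /&gt;
One way to picture the substitution is sketched below.  It is an illustration under stated assumptions, not the conversion program&amp;#039;s code: the inline rows and the &amp;lt;code&amp;gt;ABC&amp;lt;/code&amp;gt; person code are made up, and &amp;lt;code&amp;gt;observer_1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observer_2&amp;lt;/code&amp;gt; stand in for a pair of the follow table&amp;#039;s observer columns.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: use NONE when an observer value is NULL or blank.&lt;br /&gt;
-- The rows in VALUES are invented sample data, not project data.&lt;br /&gt;
select coalesce(nullif(btrim(observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from (values (&amp;#039;ABC&amp;#039;, null)&lt;br /&gt;
             , (&amp;#039;ABC&amp;#039;, &amp;#039;  &amp;#039;)) as sample(observer_1, observer_2);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;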
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates in the conversion process is complicated, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - Brec Swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
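&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with a PostgreSQL &amp;lt;code&amp;gt;UPDATE ... FROM&amp;lt;/code&amp;gt;. The actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mark the n/a cycle codes of female arriving&lt;br /&gt;
-- individuals as MISS.  Uses the same join as the query above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;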
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
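&lt;br /&gt;
The two case-normalization changes in this solution could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema copy of the table. Adding the new codes is not shown, because the table that holds the valid codes is not described in this entry.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;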
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
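&lt;br /&gt;
The start-time-only check mentioned in the Problem section is not listed above. A sketch of it, adapted from the previous query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and the match, might look like this. It is an adaptation, not necessarily the query that produced the reported count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;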
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
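&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it runs as a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema copy of the table. The actual conversion step may be implemented differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The where clause limits the update to rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;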
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
An earlier version of the query above spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (lower-case final &amp;quot;d&amp;quot;) and failed with the following error. The query now uses the spelling given in the hint.&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
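&lt;br /&gt;
As a sketch, assuming the change is made with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;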
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
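&lt;br /&gt;
A sketch of the replacement step only, assuming it is done with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;. Adding the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row to ARRIVAL_SOURCES is not shown, because that table&amp;#039;s columns are not listed in this entry. The referenced commit is the authoritative implementation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the none code in place of NULL.&lt;br /&gt;
-- If fa_data_source is validated against ARRIVAL_SOURCES,&lt;br /&gt;
-- the none value must be added there first.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;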
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
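&lt;br /&gt;
The mapping can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. This is an illustration under the assumption that the flag is textual and the &amp;quot;space&amp;quot; value is a single space character; the column alias converted_flag is hypothetical. Any value not covered by the rule yields &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values remain visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each raw flag value next to its proposed boolean.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL&lt;br /&gt;
              OR ae.ae_bad_observation_flag = &amp;#039; &amp;#039; THEN FALSE&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;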
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per-community, per-year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=563</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=563"/>
		<updated>2026-05-11T18:59:44Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Add commit tag that resolves problem #63.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
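&lt;br /&gt;
One way such a fix can be expressed, as a sketch only and not necessarily what the conversion program does: cast the value to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;, which discards the date part. The alias brec_time_only is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show the affected rows with the date part dropped.&lt;br /&gt;
SELECT &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME AS brec_time_only&lt;br /&gt;
  FROM raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  WHERE &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;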
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds: KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
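&lt;br /&gt;
A sketch of the substitution as a read-only query, assuming made_by is textual and the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;. The conversion code in the commit above may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: count log rows per person, with NULL reported as UNK.&lt;br /&gt;
SELECT COALESCE(log.made_by, &amp;#039;UNK&amp;#039;) AS made_by, count(*)&lt;br /&gt;
  FROM clean.community_membership_update_log AS log&lt;br /&gt;
  GROUP BY 1&lt;br /&gt;
  ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;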
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
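&lt;br /&gt;
As a rough sketch of what this solution and the solution to problem #24 imply, the start and end of a follow could be derived from the focal&amp;#039;s own FOLLOW_ARRIVAL rows instead of from FOLLOW.  The &amp;lt;code&amp;gt;follow_start&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_end&amp;lt;/code&amp;gt; column names are made up for illustration, and this is not necessarily how the conversion does it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follow_start and follow_end are hypothetical names.&lt;br /&gt;
-- One row per follow, taken from the focal&amp;#039;s own arrival rows.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS follow_start&lt;br /&gt;
     , MAX(fa_time_end) AS follow_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;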
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
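&lt;br /&gt;
A minimal sketch of this cleanup combined with the trailing-space removal of problem #31, assuming both are applied while reading from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The actual conversion code is not shown on this page, so treat this as an illustration only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each animid that would change, next to its cleaned form.&lt;br /&gt;
-- cleaned_animid is a hypothetical column name.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
     , fol_date&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;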
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
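&lt;br /&gt;
No query is recorded for this problem.  The following sketch, patterned on the query for problem #37, should list the follows that have exactly one observer in either the AM or the PM period.  It assumes that, as in problem #37, a NULL or blank observer column means there is no observer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: a period has a single observer when exactly one of its&lt;br /&gt;
-- two observer columns is blank, i.e. when the two blank tests differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;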
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
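&lt;br /&gt;
One way the substitution might be expressed, as a sketch only: the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is hypothetical, and the sketch assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute MISS where no cycle information was recorded.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;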
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
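&lt;br /&gt;
A minimal sketch of the recoding, assuming it is applied directly to the copy of the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is an accepted &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; value.  The statement actually used by the conversion may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;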
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
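&lt;br /&gt;
The two case corrections can be sketched as follows, assuming they are applied to the copy of the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is an illustration, not necessarily the statement the conversion uses:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;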
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
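&lt;br /&gt;
The start-time-only check mentioned in the Problem section has no query listed.  A sketch of it, obtained by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the query above, follows.  It has not been run against the current dump, so the row count may differ from the one reported:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;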
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
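&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The statement actually used by the conversion may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only the affected rows&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;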
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE (when the column name is written as &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
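&lt;br /&gt;
The change can be sketched as a single statement, assuming it is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, where the query above runs.  This is an illustration, not necessarily the statement the conversion uses:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;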
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in commit 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=562</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=562"/>
		<updated>2026-05-11T18:58:54Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: Add commit tag that resolves problem #62&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
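&lt;br /&gt;
A minimal sketch of what such a fix could look like, assuming the conversion keeps only the time-of-day portion of the column and discards the date portion.  The actual conversion code is not shown on this page, and the &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the time-of-day portion of the rows reported above.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;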
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
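&lt;br /&gt;
The substitution can be sketched as follows.  This is an illustration, not the code from the commit; it assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report UNK wherever made_by is NULL.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;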
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
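&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup can be sketched as below.  This is an illustration only, not the conversion program&amp;#039;s actual code; the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each affected animid next to its cleaned-up form.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;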
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
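&lt;br /&gt;
The observer conversion rule described above can be sketched as a query that produces one candidate row per follow and time period.  This is an illustration, not the actual conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  The period codes and output column names are assumed.&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;AM&amp;#039; as obs_period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  -- Skip the period when both observers are NULL or empty&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_b_animid, fol_date, &amp;#039;PM&amp;#039; as obs_period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, obs_period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;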
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- Keep only the observer values that contain the / separator&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
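&lt;br /&gt;
As a sketch, the substitution could be expressed with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  This assumes &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is stored as a plain text code, and the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is made up for illustration; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report MISS where no cycle information was recorded.&lt;br /&gt;
-- The cycle_state output column name is hypothetical.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;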
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
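&lt;br /&gt;
To see how the unmatched rows are distributed across data sources, a summary along these lines may help.  It is a sketch that reuses the condition above and assumes &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; is the right column for telling Brec entries from tiki entries.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count the unmatched arrivals per data source&lt;br /&gt;
SELECT follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  GROUP BY follow_arrival.fa_data_source&lt;br /&gt;
  ORDER BY follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;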
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (Brec Swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Follow up with Karl.&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix the few actual males that don&amp;#039;t have an &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females (including those of unknown sex) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.  Ian to fix in Access.&lt;br /&gt;
&lt;br /&gt;
Ian fixed in Access 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix in Access: check the dates of the follows and the accuracy of the IDs.&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
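&lt;br /&gt;
The cutoff arithmetic can be checked with a worked example.  The follow date below is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, with a made-up follow date of 2000-06-15.&lt;br /&gt;
-- Subtracting 1 year and then 14 years gives a cutoff 15 years earlier,&lt;br /&gt;
-- so only females born on or before 1985-06-15 are reported.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::date&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::interval&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::interval as cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;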
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
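&lt;br /&gt;
The two case corrections above could be previewed with a query like the following.  It is a sketch, not the conversion code, and the &amp;lt;code&amp;gt;cleaned_source&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the corrected codes next to the originals.&lt;br /&gt;
-- The cleaned_source output column name is hypothetical.&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as cleaned_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;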
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
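&lt;br /&gt;
The count that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; can presumably be reproduced by dropping that column from the query above.  This variant is a sketch and has not been checked against the 430 row figure.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against just the start time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;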
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
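&lt;br /&gt;
A minimal sketch of that cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that trimming does not collide with any key or foreign key on the column (neither assumption is confirmed here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query under Bad data afterwards is a simple check that no rows with trailing spaces remain.&lt;br /&gt;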
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
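&lt;br /&gt;
Before deleting anything, it may help to know which duplicated keys belong to rows that are identical in every column and which belong to rows that differ somewhere outside the key. The following is a sketch, assuming PostgreSQL; the output names &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: compare whole rows by casting each row to text.&lt;br /&gt;
-- n_distinct_rows = 1: every row sharing the key is an exact copy.&lt;br /&gt;
-- n_distinct_rows &amp;gt; 1: rows sharing the key differ in some other column.&lt;br /&gt;
SELECT gb.&amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
     , gb.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
     , gb.&amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
     , gb.&amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
     , count(*) AS n_rows&lt;br /&gt;
     , count(DISTINCT gb::text) AS n_distinct_rows&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY gb.&amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , gb.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , gb.&amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , gb.&amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
  HAVING count(*) &amp;gt; 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only the keys with a single distinct row are candidates for simply deleting the extra copies; the others need a decision about which row is correct.&lt;br /&gt;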
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE (when the column name is written &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
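&lt;br /&gt;
As a sketch of the change, assuming it is applied to the same &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; table that the listing query reads:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the placeholder 0 with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The update touches exactly the rows that the listing query reports, so running that query first gives the expected number of changed rows.&lt;br /&gt;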
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian changed the single SF/EVL entry to EVL in MS Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in commit 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=561</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=561"/>
		<updated>2026-05-08T16:48:09Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
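&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a single query against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; holds a textual person code.  The conversion program&amp;#039;s actual code is not shown on this page and may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: replace a NULL made_by with the UNK person.&lt;br /&gt;
select log.date_of_update&lt;br /&gt;
     , log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;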
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
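&lt;br /&gt;
A minimal sketch of the clean-up, assuming it is applied together with the trailing-space removal of problem #31.  The expression is the same one used in the problem #33 query; the output column names are illustrative only:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each raw animid beside its normalized form.&lt;br /&gt;
select distinct&lt;br /&gt;
       fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by normalized_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;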
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
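&lt;br /&gt;
A minimal sketch of the substitution for the morning observers, assuming the rule from problem #36 that a row is only created when at least one observer is recorded.  The output column names mirror FOLLOW_OBSERVERS but are illustrative, and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: use NONE when one of the two observers is missing.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;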
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
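&lt;br /&gt;
A minimal sketch of how the inactive people could be identified, assuming the person codes are available in &amp;lt;code&amp;gt;clean.people.person&amp;lt;/code&amp;gt; (as in the problem #23 query).  The &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; output column is a hypothetical name, not one taken from the new database design:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: a person is inactive when the code contains a &amp;quot;/&amp;quot;.&lt;br /&gt;
select person&lt;br /&gt;
     , strpos(person, &amp;#039;/&amp;#039;) = 0 as active&lt;br /&gt;
  from clean.people&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;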
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
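&lt;br /&gt;
A minimal sketch of the second half of this solution, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  Both are assumptions; this is an illustration, not a record of what was actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: mark arrivals with no cycle information as MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;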
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks found Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - Brec Swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
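&lt;br /&gt;
A sketch of how this change could be expressed, using the same join condition as the query above.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is already valid in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, and it is an illustration rather than the actual conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;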
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
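&lt;br /&gt;
The two case corrections in this solution could be sketched as follows.  This assumes the changes are applied to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; it is an illustration, not the conversion code itself:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: normalize the case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;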
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
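&lt;br /&gt;
One caveat with the query above: &amp;lt;code&amp;gt;GROUP BY&amp;lt;/code&amp;gt; treats &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values as equal, but the &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; comparisons in the &amp;lt;code&amp;gt;EXISTS&amp;lt;/code&amp;gt; never match a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.  If any duplicated rows have a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start or end time, they are counted as duplicates but not listed.  Whether such rows exist has not been checked here.  A possible alternative, offered as a sketch only, uses a window function so that &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; times are grouped together; the &amp;lt;code&amp;gt;dup_count&amp;lt;/code&amp;gt; column name is made up for this example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: list every row that shares its date, focal,&lt;br /&gt;
-- arriving individual, start time, and end time with another row.&lt;br /&gt;
-- PARTITION BY, like GROUP BY, puts NULL values in the same group.&lt;br /&gt;
select *&lt;br /&gt;
  from (select follow_arrival.*&lt;br /&gt;
             , count(*) over (partition by fa_fol_date&lt;br /&gt;
                                         , fa_fol_b_focal_animid&lt;br /&gt;
                                         , fa_b_arr_animid&lt;br /&gt;
                                         , fa_time_start&lt;br /&gt;
                                         , fa_time_end) as dup_count&lt;br /&gt;
          from clean.follow_arrival) as fa&lt;br /&gt;
  where fa.dup_count &amp;gt; 1&lt;br /&gt;
  order by fa.fa_fol_date&lt;br /&gt;
         , fa.fa_fol_b_focal_animid&lt;br /&gt;
         , fa.fa_b_arr_animid&lt;br /&gt;
         , fa.fa_time_start&lt;br /&gt;
         , fa.fa_time_end&lt;br /&gt;
         , fa.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;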
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
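&lt;br /&gt;
A minimal sketch of the cleanup, assuming the table to change is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  It is an illustration, not the conversion code itself:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;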
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
Note: The original query referenced &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (ending in a lower-case &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) and failed with the error below.  The query above uses the column name given in the hint.&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
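&lt;br /&gt;
A minimal sketch of the change, assuming it is applied to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  It is an illustration, not the conversion code itself:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: replace the 0 animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;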
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
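&lt;br /&gt;
A minimal sketch of the cleanup, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table with the same &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; column that the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; query above reads; the conversion program may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only; assumes clean.follow_arrival is a table, not a view.&lt;br /&gt;
-- Touch only the rows whose arriving animid changes when right-trimmed.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;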
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=560</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=560"/>
		<updated>2026-05-08T02:48:34Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
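&lt;br /&gt;
The substitution can be sketched as follows, assuming an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt;; this is an illustration, not necessarily what the commit above does:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only; assumes a person &amp;#039;UNK&amp;#039; already exists in clean.people.&lt;br /&gt;
-- Point every log row with no recorded author at the unknown person.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;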
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
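&lt;br /&gt;
A minimal sketch of how the nest flags could be derived from follow_arrival.  It assumes, based on the queries for problems #26 and #27, that fa_type_of_nesting codes 1 and 3 mean the focal started in a nest and that codes 2 and 3 mean the focal ended in one.  The sample codes are illustrative, code 0 is assumed to mean no nesting, and this is not the conversion&amp;#039;s actual code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical nesting codes, not real data.&lt;br /&gt;
WITH codes (fa_type_of_nesting) AS (&lt;br /&gt;
  VALUES (0), (1), (2), (3))&lt;br /&gt;
SELECT fa_type_of_nesting&lt;br /&gt;
       -- Codes 1 and 3 are treated as &amp;quot;began in nest&amp;quot;.&lt;br /&gt;
     , fa_type_of_nesting IN (1, 3) AS begins_in_nest&lt;br /&gt;
       -- Codes 2 and 3 are treated as &amp;quot;ended in nest&amp;quot;.&lt;br /&gt;
     , fa_type_of_nesting IN (2, 3) AS ends_in_nest&lt;br /&gt;
  FROM codes&lt;br /&gt;
  ORDER BY fa_type_of_nesting;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;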
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
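&lt;br /&gt;
A minimal sketch of the combined cleanup for problems #31 and #32, assuming it amounts to applying &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid, as the query for problem #33 does.  The animid values are made up for illustration, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical animids, not real data.&lt;br /&gt;
WITH raw_follow (fol_b_animid) AS (&lt;br /&gt;
  VALUES (&amp;#039;AB  &amp;#039;), (&amp;#039;cd&amp;#039;), (&amp;#039;EF&amp;#039;))&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
       -- Strip trailing spaces, then force upper-case.&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned&lt;br /&gt;
       -- True when the cleanup changes the value.&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid AS was_changed&lt;br /&gt;
  FROM raw_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;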
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
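&lt;br /&gt;
A minimal sketch of the substitution, assuming a missing observer is one that is NULL or empty after space trimming (the test described in problem #36).  The observer codes are made up for illustration, and the output column names only echo the OBS_BRec and OBS_Tiki columns; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical observer codes, not real data.&lt;br /&gt;
WITH raw_follow (fol_am_observer_1, fol_am_observer_2) AS (&lt;br /&gt;
  VALUES (&amp;#039;XY&amp;#039;, NULL), (&amp;#039;XY&amp;#039;, &amp;#039;  &amp;#039;), (&amp;#039;XY&amp;#039;, &amp;#039;ZW&amp;#039;))&lt;br /&gt;
SELECT&lt;br /&gt;
    -- NULLIF turns a blank value into NULL; COALESCE then supplies NONE.&lt;br /&gt;
    COALESCE(NULLIF(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
  , COALESCE(NULLIF(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM raw_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this sample the first two rows, whose second observer is NULL or blank, get &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; as the second observer.&lt;br /&gt;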
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
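&lt;br /&gt;
A worked example of the cutoff arithmetic, using a hypothetical follow date of 2020-06-15 (not taken from the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date; computes the latest birthdate that is flagged.&lt;br /&gt;
SELECT DATE &amp;#039;2020-06-15&amp;#039; AS follow_date&lt;br /&gt;
     , (DATE &amp;#039;2020-06-15&amp;#039;&lt;br /&gt;
        - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
        - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
       )::DATE AS latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second column evaluates to 2005-06-15, so a female is flagged only if she was born on or before that date, meaning she is at least 15 years old on the day of the follow.&lt;br /&gt;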
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
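&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and uses the same join as the query above; the actual conversion script may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, not the conversion script itself.&lt;br /&gt;
-- Recode n/a as MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;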
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
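&lt;br /&gt;
The case corrections above could be applied with statements like the following.  This is a sketch that assumes the corrections are made with &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; adding the new codes to the list of valid sources is not shown, because that table is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;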
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
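&lt;br /&gt;
The start-time-only check mentioned in the Problem section has no query listed.  A sketch of it, assuming it follows the same pattern as the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out (the 430-row figure has not been re-verified against this exact text):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left out)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time never match in the &amp;lt;code&amp;gt;exists&amp;lt;/code&amp;gt; comparison, so such rows are not listed by this sketch.&lt;br /&gt;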
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE and GGl for sure, and possibly FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
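&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; and that the column is a variable-length text type (as the query above implies); the actual conversion script may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces, touching only the rows that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;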
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
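&lt;br /&gt;
To help decide whether deleting is safe, it may be useful to see whether the rows sharing a key are exact copies of each other.  The following is a sketch, not a tested part of the conversion: it compares whole rows by converting each one with PostgreSQL&amp;#039;s &amp;lt;code&amp;gt;to_jsonb()&amp;lt;/code&amp;gt;, and the output names &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: one line per duplicated key.&lt;br /&gt;
-- n_distinct_rows = 1 means every row with that key is identical in all columns.&lt;br /&gt;
SELECT &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
     , count(*) AS n_rows&lt;br /&gt;
     , count(DISTINCT to_jsonb(gb)) AS n_distinct_rows&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Keys where &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; is greater than 1 differ in some other column and probably need to be checked by hand.  If the table carries an automatically generated ID column, every row will count as distinct and that column would have to be left out of the comparison.&lt;br /&gt;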
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
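The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE (when the query references &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; instead of &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;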
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=559</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=559"/>
		<updated>2026-05-08T01:26:11Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
------------+--------------------+---------&lt;br /&gt;
TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
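&lt;br /&gt;
The fix itself was made by hand in MS Access.  For illustration only, the equivalent transformation could be sketched in PostgreSQL as below, run against the data as it was before the fix.  The sketch assumes every non-empty value is &amp;quot;CH&amp;quot; followed by digits; the &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip the 2-character &amp;quot;CH&amp;quot; prefix and cast the remainder to an integer.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
     , substring(&amp;quot;B_AnimID_num&amp;quot; from 3)::integer as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; ~ &amp;#039;^CH[0-9]+$&amp;#039;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;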
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
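&lt;br /&gt;
A minimal sketch of such a view follows.  It assumes DadIDPrelim is a boolean flag and uses hypothetical lower-case column names (&amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt;); the real table has many more columns and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Append the &amp;#039;_prelim&amp;#039; suffix only when the dad is flagged as preliminary.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case when dadidprelim&lt;br /&gt;
              then dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
              else dad_id&lt;br /&gt;
         end as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;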
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
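&lt;br /&gt;
How the conversion fixes the data is not recorded here.  One plausible approach, shown only as a sketch and assuming BREC_time is a timestamp, is to discard the date portion by casting to a time:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show the time-only value for the rows that carry an unexpected date.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME WITHOUT TIME ZONE as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;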
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds: KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
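&lt;br /&gt;
A database constraint could keep this kind of overlap from being entered again.  The following is only a sketch and is not taken from the SokweDB design: it assumes PostgreSQL with the &amp;lt;code&amp;gt;btree_gist&amp;lt;/code&amp;gt; extension available and DATE-typed start and end columns, and the constraint name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- btree_gist lets a GiST index compare cm_b_animid with =.&lt;br /&gt;
create extension if not exists btree_gist;&lt;br /&gt;
-- Reject two rows for the same individual whose inclusive date ranges overlap.&lt;br /&gt;
alter table clean.community_membership&lt;br /&gt;
  add constraint no_overlapping_membership&lt;br /&gt;
  exclude using gist&lt;br /&gt;
    (cm_b_animid with =&lt;br /&gt;
   , (daterange(cm_start_date, cm_end_date, &amp;#039;[]&amp;#039;)) with &amp;amp;&amp;amp;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;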
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
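&lt;br /&gt;
The substitution can be sketched with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  The &amp;#039;UNK&amp;#039; code is the one named above; whether the conversion program performs the substitution this way is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person&amp;#039;s code.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
     , date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;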
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
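&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup amounts to the expression shown below.  This is a sketch of the idea, not the conversion program&amp;#039;s actual code; the &amp;lt;code&amp;gt;normalized_animid&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- List every follow whose animid changes when trimmed and upper-cased.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;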
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
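&lt;br /&gt;
A possible query for listing these follows, as a sketch only.  It assumes that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns for a time period is NULL or empty after trimming; this is inferred from the conversion description in problem #36 and is not confirmed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a time period has exactly one observer.&lt;br /&gt;
-- Assumes a NULL or empty column means &amp;quot;no observer&amp;quot;.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;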
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
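&lt;br /&gt;
As written, the query above also joins on case-insensitive matches, so it appears to list only names that contain a &amp;quot;/&amp;quot; and also have a variant differing by character case.  A simpler sketch, assuming the goal is just to list every distinct observer value containing a &amp;quot;/&amp;quot; character (n/a included):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value containing a &amp;quot;/&amp;quot;.&lt;br /&gt;
-- UNION removes duplicates; NULL values drop out in the WHERE.&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT people.person&lt;br /&gt;
  FROM people&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;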
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  In terms of unique names, that is about half as many.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (Brec Swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
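&lt;br /&gt;
To illustrate the cutoff arithmetic in the query above, here is a small sketch using a made-up follow date (2010-06-15 is an assumption for illustration, not a date taken from the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the latest birthdate that still counts as &amp;quot;too old&amp;quot;&lt;br /&gt;
-- for a hypothetical follow on 2010-06-15.&lt;br /&gt;
select &amp;#039;2010-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
-- cutoff is 1995-06-15 00:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A female born on or before the cutoff is at least 15 years old on the follow date, which matches the parenthetical note in the problem description.&lt;br /&gt;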
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
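&lt;br /&gt;
The recoding might look something like the following.  This is a sketch only, not the statement actually used in the conversion; it assumes &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code (see problem #42) and that nothing else constrains the column.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;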
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
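&lt;br /&gt;
The two case fixes above could be applied with something like the following sketch.  It assumes the cleanup is done with a plain update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual cleanup code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the case variants of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;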
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM           202&lt;br /&gt;
  Tiki_PM           165&lt;br /&gt;
  Tiki_SS            22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- No duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
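&lt;br /&gt;
No query is shown for the start-time-only check mentioned in the problem description.  The following sketch mirrors the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped; it is an assumption that this is how the 430-row figure was produced.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
       join dups&lt;br /&gt;
            on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
                and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                    = dups.fa_fol_b_focal_animid&lt;br /&gt;
                and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
                and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;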
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: FAC, FAR, GGl, GU, Sl, and UWE (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
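&lt;br /&gt;
A minimal sketch of that cleanup, assuming the fix is applied with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema, and the actual conversion step may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the affected table is clean.follow_arrival.&lt;br /&gt;
-- Strip trailing spaces from the focal id, touching only the rows that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query under &amp;quot;Bad data&amp;quot; afterwards should return no rows.&lt;br /&gt;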
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The column name ends in &amp;quot;AnimID&amp;quot;; spelling it &amp;quot;AnimId&amp;quot; in the query fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a value of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
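&lt;br /&gt;
A minimal sketch of the change, assuming it is applied with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion step may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: the fix is a one-off UPDATE, not part of the conversion code.&lt;br /&gt;
-- Replace every 0 animal ID number with NULL; no other check is made on the rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;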
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=558</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=558"/>
		<updated>2026-05-08T01:25:38Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
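&lt;br /&gt;
A minimal sketch of the substitution, for previewing its effect before converting.  It assumes the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; person code described above and that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column; it is not the conversion program&amp;#039;s actual code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count log rows per person, with NULL MadeBy values shown as UNK.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
     , count(*) as row_count&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  group by coalesce(made_by, &amp;#039;UNK&amp;#039;)&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;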
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
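&lt;br /&gt;
A sketch of the cleanup from problems #31 and #32 combined, showing each affected animid next to its cleaned form.  This only illustrates the idea, using the same functions as the queries on this page; it is not the conversion program&amp;#039;s actual code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Animids changed by trimming trailing spaces and forcing upper-case.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;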
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
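&lt;br /&gt;
A possible query for listing the affected follows, given as a sketch.  It assumes, as in problem #37, that a NULL or blank (after space trimming) observer column means there is no observer, and it reports follows where exactly one observer of the AM pair or of the PM pair is missing:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Each comparison is true when the two &amp;quot;is blank&amp;quot; tests of a pair disagree.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;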
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG: Fixed many rows in MS Access.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (Brec Swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, turning the database into a temporal database that tracks change history and lets the data be viewed as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
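&lt;br /&gt;
One way this change could be written, as a sketch only. It assumes the update runs directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already a valid cycle code; the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: run after biography has been loaded into the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;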
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
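&lt;br /&gt;
The case corrections above could be applied with updates along these lines. This is a sketch, assuming the corrections are made to &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the mis-cased Brec codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the mis-cased Tiki code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;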
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
The combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; is always unique.&lt;br /&gt;
But checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
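&lt;br /&gt;
No query is listed for the check on start time alone. A sketch of what it might look like, assuming it mirrors the start and end time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped from the grouping:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Each group in &amp;lt;code&amp;gt;dups&amp;lt;/code&amp;gt; is unique on its four columns, so the join does not multiply rows.&lt;br /&gt;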
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
SRE: still need to address: FAC, FAR, GGl, GU, Sl, and UWE (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
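&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the column is a variable-length text type (with a fixed-width &amp;lt;code&amp;gt;char(n)&amp;lt;/code&amp;gt; column, trailing spaces are not significant in comparisons, so the test would behave differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only touch the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;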
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
Note: The original query spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
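&lt;br /&gt;
The change itself is a one-statement update. A sketch, assuming it is applied to &amp;lt;code&amp;gt;biography&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace the placeholder 0 with NULL; no other validation is done.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;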
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=554</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=554"/>
		<updated>2026-05-05T20:58:51Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: comment on error 43 number of records&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
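&lt;br /&gt;
As a rough illustration of the substitution only (the real fix lives in the conversion program, in the commit above), assuming the unknown person already exists in &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt; with the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;UNK&amp;#039; is the people.person code of the unknown person.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;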
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
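&lt;br /&gt;
As a reading aid, the following sketch spells out how the queries for problems #26 and #27 appear to interpret &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt;: codes 1 and 3 are taken to mean the focal started in the nest, and codes 2 and 3 that the focal ended in the nest.  The code 0 stands in for any other value, and the output column names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only; not part of the conversion process.&lt;br /&gt;
-- The 0 row is an assumed stand-in for &amp;quot;any other code&amp;quot;.&lt;br /&gt;
select codes.fa_type_of_nesting&lt;br /&gt;
     , (codes.fa_type_of_nesting in (1, 3))::int as begins_in_nest&lt;br /&gt;
     , (codes.fa_type_of_nesting in (2, 3))::int as ends_in_nest&lt;br /&gt;
  from (values (0), (1), (2), (3)) as codes (fa_type_of_nesting)&lt;br /&gt;
  order by codes.fa_type_of_nesting;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;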
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
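&lt;br /&gt;
A minimal sketch of the cleanup, combining it with the trailing-space removal of problem #31 and using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the queries for problem #33.  This only previews the affected values and is not the actual conversion code; the output column names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- A preview of the cleanup, not the conversion code itself.&lt;br /&gt;
select distinct&lt;br /&gt;
       fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;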
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
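&lt;br /&gt;
No query is recorded for this problem.  A possible one, sketched under the assumption that the conversion works as described in problem #36 (one FOLLOW_OBSERVERS row per time period, unless both observers of the period are blank), finds the follows where exactly one of the two observers of a period is recorded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- A sketch, not a query used by the conversion process.&lt;br /&gt;
-- Each inner comparison is true when that observer is blank; the two&lt;br /&gt;
-- comparisons of a period differ when exactly one observer is blank.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;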
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand. But note that the error count decreased substantially with the 2026-04-14 dump.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
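&lt;br /&gt;
To illustrate the endpoint arithmetic with a made-up follow date (not taken from the data): the query in the Bad data section below subtracts 14 years and then 1 more year from the follow date, so for a follow on 2020-06-15 only females born on or before 2005-06-15 are reported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date, for illustration only.&lt;br /&gt;
select (date &amp;#039;2020-06-15&amp;#039;&lt;br /&gt;
        - &amp;#039;1 year&amp;#039;::interval&lt;br /&gt;
        - &amp;#039;14 years&amp;#039;::interval)::date as latest_reported_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;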
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
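&lt;br /&gt;
A sketch of the substitution, written as a query that previews the change rather than as the actual cleanup code.  The &amp;lt;code&amp;gt;cleaned_cycle&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative preview; assumes fa_type_of_cycle is a text column.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case when biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
                 and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
            then &amp;#039;MISS&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as cleaned_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;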
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
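&lt;br /&gt;
The case corrections above could be sketched as follows, assuming they are made with plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (an assumption; the conversion may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;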
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
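&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above.  A sketch of it, following the same pattern, lists each duplicated combination together with its row count.  The &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; column name is made up for illustration, and this page does not say whether the 430 figure counts combinations or individual rows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;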
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
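&lt;br /&gt;
A minimal sketch of the cleanup, assuming the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; is acceptable (the actual conversion step may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only the affected rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;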
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The column name is case-sensitive when quoted.  With the column spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;, the query fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
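&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL, without further validation.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;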
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=553</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=553"/>
		<updated>2026-05-05T20:57:26Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: problem 43 error count 732 -&amp;gt; 276&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
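&lt;br /&gt;
The page does not record how the conversion fixes these rows. One plausible approach, shown only as a sketch with made-up sample values, is to keep the time of day and discard the date part:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; the second one carries a stray date.&lt;br /&gt;
WITH sample_notes (brec_time) AS&lt;br /&gt;
  (VALUES (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
        , (TIMESTAMP &amp;#039;2004-05-06 07:15:00&amp;#039;))&lt;br /&gt;
SELECT brec_time&lt;br /&gt;
     , brec_time::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039; AS has_stray_date&lt;br /&gt;
     , brec_time::TIME AS time_only  -- the date part is dropped&lt;br /&gt;
  FROM sample_notes;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;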
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/1/2024. Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
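&lt;br /&gt;
A minimal sketch of the substitution, assuming the log column is named &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; as in the query above. The sample rows and person codes are made up, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
WITH sample_log (chimp_id, made_by) AS&lt;br /&gt;
  (VALUES (&amp;#039;AB&amp;#039;, &amp;#039;P1&amp;#039;)&lt;br /&gt;
        , (&amp;#039;CD&amp;#039;, NULL))&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) AS made_by  -- NULL becomes the unknown person&lt;br /&gt;
  FROM sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;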
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
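&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup can be sketched as follows. The sample values are hypothetical, and this is an illustration, not the conversion&amp;#039;s actual code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids.&lt;br /&gt;
WITH sample_follow (fol_b_animid) AS&lt;br /&gt;
  (VALUES (&amp;#039;AB  &amp;#039;)   -- trailing spaces, as in problem #31&lt;br /&gt;
        , (&amp;#039;cd&amp;#039;)     -- lower-case, as in problem #32&lt;br /&gt;
        , (&amp;#039;EF&amp;#039;))    -- already clean&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;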
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
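&lt;br /&gt;
The row-creation rule described above can be sketched as follows. This is an illustration under assumptions, not the conversion&amp;#039;s actual code: the sample follows are made up, and the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample follows; only the observer columns matter here.&lt;br /&gt;
WITH sample_follow (fol_date, fol_b_animid&lt;br /&gt;
                  , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) AS&lt;br /&gt;
  (VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AB&amp;#039;, &amp;#039;P1&amp;#039;, &amp;#039;P2&amp;#039;, &amp;#039;  &amp;#039;, NULL)&lt;br /&gt;
        , (DATE &amp;#039;2000-01-02&amp;#039;, &amp;#039;CD&amp;#039;, &amp;#039;P1&amp;#039;, NULL, &amp;#039;P3&amp;#039;, &amp;#039;P4&amp;#039;))&lt;br /&gt;
   , periods AS&lt;br /&gt;
  -- One candidate row per follow and time period.&lt;br /&gt;
  (SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
        , fol_am_observer_1 AS obs_brec, fol_am_observer_2 AS obs_tiki&lt;br /&gt;
     FROM sample_follow&lt;br /&gt;
   UNION ALL&lt;br /&gt;
   SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
        , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
     FROM sample_follow)&lt;br /&gt;
SELECT fol_date, fol_b_animid, period&lt;br /&gt;
     , btrim(obs_brec) AS obs_brec, btrim(obs_tiki) AS obs_tiki&lt;br /&gt;
  FROM periods&lt;br /&gt;
  -- Skip a period only when both observers are NULL or blank.&lt;br /&gt;
  WHERE coalesce(btrim(obs_brec), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR coalesce(btrim(obs_tiki), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With these sample rows the first follow yields only a morning row, because both of its afternoon observers are blank or NULL.&lt;br /&gt;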
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
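The affected follows could be listed with a query along the following lines.  This is a sketch, not a query taken from the conversion: it assumes that &amp;quot;no observer&amp;quot; means NULL or the empty string after space trimming, as described in problem #36, and it reports a follow when exactly one of the two observer columns of a time period is filled in.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a time period has exactly one observer.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;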
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
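&lt;br /&gt;
As written, the query above reports only those &amp;quot;/&amp;quot; values that also differ in case from some other value.  A simpler sketch, assuming the goal is to list every distinct observer value containing a &amp;quot;/&amp;quot;, is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value that contains a &amp;quot;/&amp;quot;.&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.person) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)) AS obs (person)&lt;br /&gt;
  WHERE STRPOS(obs.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;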
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
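&lt;br /&gt;
As a sketch of the second half of that step, assuming the substitution is made with an UPDATE in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (this page does not say where it is actually done) and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code already exists in &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: give arrivals without cycle information the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;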
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
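&lt;br /&gt;
To illustrate the boundary, here is a small sketch using a made-up follow date; the two intervals subtracted in the query below add up to 15 years:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch with a hypothetical follow date of 2010-06-15.&lt;br /&gt;
-- Females born on or before the computed date are at least 15 on that day.&lt;br /&gt;
select &amp;#039;2010-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;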
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
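&lt;br /&gt;
A minimal sketch of such a change, assuming it is made with a single UPDATE that reuses the join condition of the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace n/a with MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;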
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
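&lt;br /&gt;
The two case corrections could be sketched as follows, assuming they are applied with UPDATEs in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the case variants named above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;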
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
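&lt;br /&gt;
The start-time-only check described in the problem statement is not shown above.  A sketch of it, assuming it differs from the previous query only in leaving out &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;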
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
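&lt;br /&gt;
A sketch of that cleanup, assuming the column is a variable-length text type (with a fixed-length &amp;lt;code&amp;gt;char&amp;lt;/code&amp;gt; column the trailing padding is not significant in comparisons, so the condition would match nothing):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;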
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
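&lt;br /&gt;
One way to inform that decision is to check whether the rows that share a key are identical in every column.  The following is only a sketch: it relies on the PostgreSQL cast of a whole row to text, and the &amp;lt;code&amp;gt;gb&amp;lt;/code&amp;gt; alias and output column names are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- List rows that occur more than once with every column identical.&lt;br /&gt;
SELECT gb::text AS whole_row&lt;br /&gt;
     , count(*) AS copies&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY gb::text&lt;br /&gt;
  HAVING count(*) &amp;gt; 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows listed here are exact copies.  Duplicate keys that do not appear in this result differ in some non-key column and would need to be reviewed by hand.&lt;br /&gt;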
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; produces:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
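&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace the placeholder 0 with NULL; no other columns are examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;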
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
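&lt;br /&gt;
As a sketch, and assuming the cleanup is done in place on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion scripts may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the arriving animid; touch only the rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;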
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
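&lt;br /&gt;
The second half of that change might look like the following sketch.  It assumes that the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (the column names of that table are not given here, so the insert is omitted) and that the stored value is the literal text &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumes ARRIVAL_SOURCES already contains the value &amp;#039;none&amp;#039;.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;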
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Installing_pg_isok&amp;diff=482</id>
		<title>Installing pg isok</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Installing_pg_isok&amp;diff=482"/>
		<updated>2025-12-15T19:48:46Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;pg_isok is &amp;quot;The Warning System&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== When to Install ==&lt;br /&gt;
&lt;br /&gt;
pg_isok is installed in its own schema, isolated from all of SokweDB&amp;#039;s schemas.  It need only be installed once, upon database creation.  Removing or re-installing SokweDB&amp;#039;s schemas using the build tools does not interfere with the content of the pg_isok schema.&lt;br /&gt;
&lt;br /&gt;
Of course, if an entire database is dropped, re-created, and restored from backup, pg_isok will also need to be restored.  Depending on the situation, this may involve reinstalling pg_isok and then restoring the content of its tables.&lt;br /&gt;
&lt;br /&gt;
== Installing The Warning System (pg_isok) ==&lt;br /&gt;
&lt;br /&gt;
The [https://kop.codeberg.page/pg_isok_docs pg_isok] PostgreSQL extension is used as SokweDB&amp;#039;s &amp;quot;Warning System&amp;quot;.  Because SokweDB is in the cloud, pg_isok must be installed from a SQL script.&lt;br /&gt;
&lt;br /&gt;
To generate the SQL needed to install pg_isok, follow the cloud instructions found in the [https://kop.codeberg.page/pg_isok_docs pg_isok documentation].&lt;br /&gt;
SokweDB expects installation in a schema named &amp;lt;code&amp;gt;isok&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Install pg_isok in each database.&lt;br /&gt;
&lt;br /&gt;
After installation, permissions must be granted. &lt;br /&gt;
&lt;br /&gt;
The following example shows the steps involved to install pg_isok version 0.1.4, using &amp;lt;code&amp;gt;psql&amp;lt;/code&amp;gt;, and to grant the expected permissions.  (The name of the SQL file, included with &amp;lt;code&amp;gt;\i&amp;lt;/code&amp;gt;, will vary with the pg_isok version and the relative location of the sql file.)&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;&lt;br /&gt;
 SET ROLE TO admin;&lt;br /&gt;
 CREATE SCHEMA isok;&lt;br /&gt;
 GRANT USAGE ON SCHEMA isok TO reader;&lt;br /&gt;
 GRANT USAGE ON SCHEMA isok TO writer;&lt;br /&gt;
 &lt;br /&gt;
 \i ../pg_isok_cloud--0.1.4.sql&lt;br /&gt;
 &lt;br /&gt;
 -- IQ_TYPES&lt;br /&gt;
 GRANT SELECT ON isok.iq_types TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.iq_types TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- IR_TYPES&lt;br /&gt;
 GRANT SELECT ON isok.ir_types TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.ir_types TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- ISOK_QUERIES&lt;br /&gt;
 GRANT SELECT ON isok.isok_queries TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.isok_queries TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- ISOK_RESULTS&lt;br /&gt;
 GRANT SELECT ON isok.isok_results TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.isok_results TO writer;&lt;br /&gt;
 &lt;br /&gt;
 GRANT SELECT ON isok.isok_results_irid_seq TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_results_irid_seq TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_results_irid_seq TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- run_isok_queries()&lt;br /&gt;
 GRANT EXECUTE ON FUNCTION isok.run_isok_queries() TO writer;&lt;br /&gt;
 GRANT EXECUTE ON FUNCTION isok.run_isok_queries(TEXT) TO writer;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Updating pg_isok ==&lt;br /&gt;
&lt;br /&gt;
Installing or updating pg_isok requires a clean isok schema.  To keep the existing warning queries and results, back them up before updating and reload them afterwards.  The procedure for updating pg_isok is similar to that for installation, with additional steps: first create a data-only dump of the isok schema contents, then drop the schema, reinstall pg_isok, and restore the dumped contents once the update is complete.  As with installation, these steps must be carried out separately for each database.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;pg_dump -h HOST -d DATABASE -n isok -Fc --data-only &amp;gt; isok_schema_contents&amp;lt;/code&amp;gt;&lt;br /&gt;
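&lt;br /&gt;
Before dropping the schema, it can help to record row counts so that the restore can be checked afterwards.  A sketch, run within psql, using the table names from the grants above (the &amp;lt;code&amp;gt;tbl&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; output names are illustrative):&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;-- Run once before the drop and again after the restore; the counts should match.&lt;br /&gt;
 SELECT &amp;#039;isok_queries&amp;#039; AS tbl, count(*) AS n_rows FROM isok.isok_queries&lt;br /&gt;
 UNION ALL&lt;br /&gt;
 SELECT &amp;#039;isok_results&amp;#039;, count(*) FROM isok.isok_results;&amp;lt;/code&amp;gt;&lt;br /&gt;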
&lt;br /&gt;
Within psql:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;DROP SCHEMA isok CASCADE;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Follow the installation steps above to (re)install pg_isok then restore the dumped contents to repopulate the schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;pg_restore -h HOST -d DATABASE -n isok --data-only isok_schema_contents&amp;lt;/code&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=PostgreSQL_Administration&amp;diff=481</id>
		<title>PostgreSQL Administration</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=PostgreSQL_Administration&amp;diff=481"/>
		<updated>2025-12-15T19:38:36Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: now refer isok administration to the installing isok page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Killing long-running queries ==&lt;br /&gt;
&lt;br /&gt;
This section is about stopping, that is &amp;#039;&amp;#039;killing&amp;#039;&amp;#039;, queries that should not be running.&lt;br /&gt;
&lt;br /&gt;
Because users run SQL queries, they may inadvertently execute erroneous queries.&lt;br /&gt;
These queries usually run for excessive amounts of time and produce extremely large result sets.&lt;br /&gt;
&lt;br /&gt;
Because SokweDB runs in the cloud and cloud billing is usage-based, in addition to slowing down the system, these queries can cost money.&lt;br /&gt;
&lt;br /&gt;
=== Problem overview ===&lt;br /&gt;
&lt;br /&gt;
It is relatively easy to write such a query.  If you&lt;br /&gt;
 &amp;lt;code&amp;gt;SELECT ...&lt;br /&gt;
   FROM tablea, tableb, tablec ...&amp;lt;/code&amp;gt;&lt;br /&gt;
and do not supply any &amp;lt;code&amp;gt;WHERE&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;JOIN&amp;lt;/code&amp;gt; conditions,&lt;br /&gt;
the result will be the cross product of all rows of all tables.  In other words, each row of each table will be paired up with every row of every other table, producing A times B times C number of output rows, where A, B, and C are the number of rows in &amp;lt;code&amp;gt;tablea&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;tableb&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;tablec&amp;lt;/code&amp;gt;.&lt;br /&gt;
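&lt;br /&gt;
To get a feel for the size of such a result without running the query itself, the row counts can be multiplied.  A sketch, using the placeholder table names from above; the cast to &amp;lt;code&amp;gt;numeric&amp;lt;/code&amp;gt; is there only to avoid integer overflow on large tables:&lt;br /&gt;
 &amp;lt;code&amp;gt;-- Number of rows an unconstrained join of the three tables would return.&lt;br /&gt;
 SELECT (SELECT count(*) FROM tablea)::numeric&lt;br /&gt;
      * (SELECT count(*) FROM tableb)&lt;br /&gt;
      * (SELECT count(*) FROM tablec) AS cross_product_rows;&amp;lt;/code&amp;gt;&lt;br /&gt;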
&lt;br /&gt;
Some of the generic database interfaces may have ways to monitor the database backend to discover and kill such bad queries.&lt;br /&gt;
Alternately, use the [[#Finding a query to kill|manual process]] below.&lt;br /&gt;
&lt;br /&gt;
=== Permissions required ===&lt;br /&gt;
&lt;br /&gt;
No matter the user interface used, the permission requirements are the same:&lt;br /&gt;
Any user can kill their own queries.&lt;br /&gt;
Only [https://sokwe.janegoodall.org/doc/tech_spec/architecture/permissions/#the-administrator-permission-levels an administrator] can kill other users&amp;#039; queries.&lt;br /&gt;
&lt;br /&gt;
=== Finding a query to kill ===&lt;br /&gt;
To kill such a query, first find its process number, its &amp;lt;code&amp;gt;pid&amp;lt;/code&amp;gt;.&lt;br /&gt;
This is found in the &amp;lt;code&amp;gt;pid&amp;lt;/code&amp;gt; column of the following query:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;SELECT * FROM pg_stat_activity;&amp;lt;/code&amp;gt;&lt;br /&gt;
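&lt;br /&gt;
On a busy system that output can be hard to scan.  The following sketch narrows it to queries that are currently running, longest-running first, and leaves out the session doing the looking.  The choice of columns and the &amp;lt;code&amp;gt;runtime&amp;lt;/code&amp;gt; name are only a suggestion.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;-- Active queries, longest-running first, excluding this session.&lt;br /&gt;
 SELECT pid, usename, state, now() - query_start AS runtime, query&lt;br /&gt;
   FROM pg_stat_activity&lt;br /&gt;
   WHERE state = &amp;#039;active&amp;#039;&lt;br /&gt;
         AND pid &amp;lt;&amp;gt; pg_backend_pid()&lt;br /&gt;
   ORDER BY runtime DESC;&amp;lt;/code&amp;gt;&lt;br /&gt;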
&lt;br /&gt;
=== Killing the query ===&lt;br /&gt;
&lt;br /&gt;
The following statement kills the query, where &amp;lt;code&amp;gt;MYPID&amp;lt;/code&amp;gt; is the pid of the query to be killed:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;SELECT pg_terminate_backend(MYPID);&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The query may not always immediately stop.&lt;br /&gt;
It is best to check that no mistake was made and the process was actually killed.&lt;br /&gt;
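&lt;br /&gt;
A sketch of that check, along with a gentler alternative; &amp;lt;code&amp;gt;MYPID&amp;lt;/code&amp;gt; is the same placeholder as above.  &amp;lt;code&amp;gt;pg_cancel_backend()&amp;lt;/code&amp;gt; cancels only the running query and leaves the connection open, whereas &amp;lt;code&amp;gt;pg_terminate_backend()&amp;lt;/code&amp;gt; closes the whole connection.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;-- Returns no rows once the terminated backend is gone.&lt;br /&gt;
 SELECT pid, state, query FROM pg_stat_activity WHERE pid = MYPID;&lt;br /&gt;
 &lt;br /&gt;
 -- Alternative: cancel the current query but keep the session connected.&lt;br /&gt;
 SELECT pg_cancel_backend(MYPID);&amp;lt;/code&amp;gt;&lt;br /&gt;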
&lt;br /&gt;
&lt;br /&gt;
== Installing Isok, &amp;quot;The Warning System&amp;quot; ==&lt;br /&gt;
&lt;br /&gt;
SokweDB uses [https://kop.codeberg.page/pg_isok_docs Isok] to generate its warnings. Refer to the [[installing pg_isok|Installing_pg_isok]] page for instructions on installing and administering the pg_isok warning system.&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Installing_pg_isok&amp;diff=480</id>
		<title>Installing pg isok</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Installing_pg_isok&amp;diff=480"/>
		<updated>2025-12-15T19:33:07Z</updated>

		<summary type="html">&lt;p&gt;StevanEarl: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;pg_isok is &amp;quot;The Warning System&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== When to Install ==&lt;br /&gt;
&lt;br /&gt;
pg_isok is installed in its own schema, isolated from all of SokweDB&amp;#039;s schemas.  It need only be installed once, upon database creation.  Removing or re-installing SokweDB&amp;#039;s schemas using the build tools does not interfere with the content of the pg_isok schema.&lt;br /&gt;
&lt;br /&gt;
Of course, if an entire database is dropped, re-created, and restored from backup, pg_isok will also need to be restored.  Depending on the situation, this may involve reinstalling pg_isok and then restoring the content of its tables.&lt;br /&gt;
&lt;br /&gt;
== Installing The Warning System (pg_isok) ==&lt;br /&gt;
&lt;br /&gt;
The [https://kop.codeberg.page/pg_isok_docs pg_isok] PostgreSQL extension is used as SokweDB&amp;#039;s &amp;quot;Warning System&amp;quot;.  Because SokweDB is in the cloud, pg_isok must be installed from a SQL script.&lt;br /&gt;
&lt;br /&gt;
To generate the SQL needed to install pg_isok, follow the cloud instructions found in the [https://kop.codeberg.page/pg_isok_docs pg_isok documentation].&lt;br /&gt;
SokweDB expects installation in a schema named &amp;lt;code&amp;gt;isok&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Install pg_isok in each database.&lt;br /&gt;
&lt;br /&gt;
After installation, permissions must be granted. &lt;br /&gt;
&lt;br /&gt;
The following example shows the steps involved to install pg_isok version 0.1.4, using &amp;lt;code&amp;gt;psql&amp;lt;/code&amp;gt;, and to grant the expected permissions.  (The name of the SQL file, included with &amp;lt;code&amp;gt;\i&amp;lt;/code&amp;gt;, will vary with the pg_isok version and the relative location of the sql file.)&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;code&amp;gt;&lt;br /&gt;
 SET ROLE TO admin;&lt;br /&gt;
 CREATE SCHEMA isok;&lt;br /&gt;
 GRANT USAGE ON SCHEMA isok TO reader;&lt;br /&gt;
 GRANT USAGE ON SCHEMA isok TO writer;&lt;br /&gt;
 &lt;br /&gt;
 \i ../pg_isok_cloud--0.1.4.sql&lt;br /&gt;
 &lt;br /&gt;
 -- IQ_TYPES&lt;br /&gt;
 GRANT SELECT ON isok.iq_types TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.iq_types TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.iq_types TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- IR_TYPES&lt;br /&gt;
 GRANT SELECT ON isok.ir_types TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.ir_types TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.ir_types TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- ISOK_QUERIES&lt;br /&gt;
 GRANT SELECT ON isok.isok_queries TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_queries TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.isok_queries TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- ISOK_RESULTS&lt;br /&gt;
 GRANT SELECT ON isok.isok_results TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT INSERT ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_results TO writer;&lt;br /&gt;
 GRANT DELETE ON isok.isok_results TO writer;&lt;br /&gt;
 &lt;br /&gt;
 GRANT SELECT ON isok.isok_results_irid_seq TO reader;&lt;br /&gt;
 GRANT SELECT ON isok.isok_results_irid_seq TO writer;&lt;br /&gt;
 GRANT UPDATE ON isok.isok_results_irid_seq TO writer;&lt;br /&gt;
 &lt;br /&gt;
 -- run_isok_queries()&lt;br /&gt;
 GRANT EXECUTE ON FUNCTION isok.run_isok_queries() TO writer;&lt;br /&gt;
 GRANT EXECUTE ON FUNCTION isok.run_isok_queries(TEXT) TO writer;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Updating pg_isok ==&lt;br /&gt;
&lt;br /&gt;
Installing or updating pg_isok requires a clean isok schema.  To keep the existing warning queries and results, back them up before updating and reload them afterwards.  The procedure for updating pg_isok is similar to that for installation, with additional steps: first create a data-only dump of the isok schema contents, then drop the schema, reinstall pg_isok, and restore the dumped contents once the upgrade is complete.  As with installation, these steps must be carried out separately for each database.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;pg_dump -h HOST -d DATABASE -n isok -Fc --data-only &amp;gt; isok_schema_contents&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Within psql:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;DROP SCHEMA isok CASCADE;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Follow the installation steps above to (re)install pg_isok then restore the dumped contents to repopulate the schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;pg_restore -h HOST -d DATABASE -n isok --data-only isok_schema_contents&amp;lt;/code&amp;gt;&lt;/div&gt;</summary>
		<author><name>StevanEarl</name></author>
	</entry>
</feed>