{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Gender in Wikipedia\n", "\n", "* We follow the ICWSM 2015 paper: \n", "\n", "### DBpedia\n", "* Information on their downloads as CSV \n", "* Downloads: a separate JSON/CSV file for each class \n", "\n", "\n", "### Idea:\n", "* We redo the lexical analysis for Wikipedia, but now we \n", "\n", "> consider all persons from Wikipedia\n", "\n", "> compare the infoboxes (attributes and values) of males and females\n", "\n", "> (Maybe) compare infoboxes of spouses\n", "\n", "* Comparison: mutual information, or log likelihood \n", "* **Main new problem: how to determine the gender of a person?**\n", "\n", "\n", "### Data exploration\n", "* The difference between a column and its \"_label\" counterpart is whether the value is a DBpedia link or a plain string (without the namespace/URL)\n", " * **We can thus remove the \"_label\" columns**\n", "* If we remove persons without a birth date (or without a description), we lose more than a third of the pages.\n", "* Average and median number of attributes with a value: 14\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 1: Download the data" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2015-10-19 15:31:16-- http://web.informatik.uni-mannheim.de/DBpediaAsTables/csv/Person.csv.gz\n", "Resolving web.informatik.uni-mannheim.de... 134.155.95.98\n", "Connecting to web.informatik.uni-mannheim.de|134.155.95.98|:80... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 603469009 (576M) [application/x-gzip]\n", "Saving to: `Person.csv.gz'\n", "\n", "100%[======================================>] 603,469,009 6.82M/s in 95s \n", "\n", "2015-10-19 15:32:51 (6.07 MB/s) - `Person.csv.gz' saved [603469009/603469009]\n", "\n" ] } ], "source": [ "!wget http://web.informatik.uni-mannheim.de/DBpediaAsTables/csv/Person.csv.gz" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!gunzip Person.csv.gz " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-rw-r--r-- 1 admin staff 8.6G Sep 24 2014 Person.csv\r\n" ] } ], "source": [ "# The uncompressed file is large (8.6G)\n", "!ls -lh Person.csv " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.6 M persons\n", "```\n", "wcw-staff-145-18-165-76:Data admin$ wc -l Person.csv \n", " 1649650 Person.csv\n", "``` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 2: Explore with a small sample\n", "\n", "* Just get to know the data, and refresh our pandas knowledge\n", "* We work with the first 10K lines" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Make a smaller file for testing\n", "#! 
head -10000 Person.csv > Person_10K.csv" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"URI\",\"rdf-schema#label\",\"rdf-schema#comment\",\"timeInSpace\",\"wheelbase\",\"length\",\"width\",\"height\",\"weight\",\"runtime\",\"academicAdvisor_label\",\"academicAdvisor\",\"activeYearsEndDate\",\"activeYearsEndYear\",\"activeYearsStartDate\",\"activeYearsStartYear\",\"album_label\",\"album\",\"alias\",\"allegiance\",\"almaMater_label\",\"almaMater\",\"appointer_label\",\"appointer\",\"artist_label\",\"artist\",\"associate_label\",\"associate\",\"associatedAct_label\",\"associatedAct\",\"associatedBand_label\",\"associatedBand\",\"associatedMusicalArtist_label\",\"associatedMusicalArtist\",\"author_label\",\"author\",\"automobileModel\",\"award_label\",\"award\",\"background\",\"battingSide\",\"battle_label\",\"battle\",\"beatifiedBy_label\",\"beatifiedBy\",\"beatifiedDate\",\"beatifiedPlace_label\",\"beatifiedPlace\",\"bestFinish\",\"bibsysId\",\"billed_label\",\"billed\",\"birthDate\",\"birthName\",\"birthPlace_label\",\"birthPlace\",\"birthYear\",\"block\",\"bnfId\",\"board_label\",\"board\",\"bodyDiscovered_label\",\"bodyDiscovered\",\"bowlRecord\",\"bpnId\",\"buriedPlace_label\",\"buriedPlace\",\"bustSize\",\"callSign\",\"canonizedBy_label\",\"canonizedBy\",\"canonizedDate\",\"canonizedPlace_label\",\"canonizedPlace\",\"careerPoints\",\"careerPrizeMoney\",\"careerStation_label\",\"careerStation\",\"centuryBreaks\",\"chairman_label\",\"chairman\",\"championships\",\"chancellor_label\",\"chancellor\",\"child_label\",\"child\",\"choreographer_label\",\"choreographer\",\"citizenship_label\",\"citizenship\",\"club_label\",\"club\",\"coach_label\",\"coach\",\"coachedTeam_label\",\"coachedTeam\",\"coachingRecord\",\"college_label\",\"college\",\"committee\",\"commonName\",\"configuration\",\"country_label\",\"country\",\"creator_label\",\"creator\",\"currentMember
_label\",\"currentMember\",\"currentPartner_label\",\"currentPartner\",\"currentRank\",\"currentRecord\",\"dateOfBurial\",\"deathCause_label\",\"deathCause\",\"deathDate\",\"deathPlace_label\",\"deathPlace\",\"deathYear\",\"debut\",\"debutTeam_label\",\"debutTeam\",\"depictionDescription\",\"deputy_label\",\"deputy\",\"discipline_label\",\"discipline\",\"doctoralAdvisor_label\",\"doctoralAdvisor\",\"doctoralStudent_label\",\"doctoralStudent\",\"draft\",\"draftPick\",\"draftRound\",\"draftTeam_label\",\"draftTeam\",\"draftYear\",\"education_label\",\"education\",\"electionDate\",\"electionMajority\",\"employer_label\",\"employer\",\"endYearOfInsertion\",\"engine_label\",\"engine\",\"era_label\",\"era\",\"espnId\",\"ethnicity_label\",\"ethnicity\",\"event_label\",\"event\",\"eyeColor\",\"fastestLap\",\"feastDay\",\"field_label\",\"field\",\"firstAppearance\",\"firstRace_label\",\"firstRace\",\"firstWin_label\",\"firstWin\",\"format_label\",\"format\",\"formerChoreographer_label\",\"formerChoreographer\",\"formerCoach_label\",\"formerCoach\",\"formerHighschool_label\",\"formerHighschool\",\"formerPartner_label\",\"formerPartner\",\"formerTeam_label\",\"formerTeam\",\"garrison_label\",\"garrison\",\"gender_label\",\"gender\",\"genre_label\",\"genre\",\"governor_label\",\"governor\",\"governorGeneral_label\",\"governorGeneral\",\"hairColor\",\"hallOfFame\",\"height\",\"heir_label\",\"heir\",\"highestBreak\",\"highestRank\",\"highschool_label\",\"highschool\",\"hipSize\",\"hometown_label\",\"hometown\",\"honours_label\",\"honours\",\"incumbent_label\",\"incumbent\",\"individualisedGnd\",\"influenced_label\",\"influenced\",\"influencedBy_label\",\"influencedBy\",\"institution_label\",\"institution\",\"instrument_label\",\"instrument\",\"isniId\",\"knownFor_label\",\"knownFor\",\"language_label\",\"language\",\"lastAppearance_label\",\"lastAppearance\",\"lastPosition\",\"lastRace_label\",\"lastRace\",\"lastWin_label\",\"lastWin\",\"lccnId\",\"league_label\",\"league\",\"len
gth\",\"lieutenant_label\",\"lieutenant\",\"location_label\",\"location\",\"mainInterest_label\",\"mainInterest\",\"majorShrine_label\",\"majorShrine\",\"majorityLeader\",\"manager_label\",\"manager\",\"managerClub_label\",\"managerClub\",\"manufacturer_label\",\"manufacturer\",\"mbaId\",\"militaryBranch_label\",\"militaryBranch\",\"militaryCommand\",\"militaryRank_label\",\"militaryRank\",\"militaryUnit_label\",\"militaryUnit\",\"mission_label\",\"mission\",\"monarch_label\",\"monarch\",\"movement_label\",\"movement\",\"mythology_label\",\"mythology\",\"nationalTeam_label\",\"nationalTeam\",\"nationality_label\",\"nationality\",\"network_label\",\"network\",\"networth\",\"nlaId\",\"nominee_label\",\"nominee\",\"notableIdea_label\",\"notableIdea\",\"notableStudent_label\",\"notableStudent\",\"notableWork_label\",\"notableWork\",\"number\",\"numberBuilt\",\"numberOfEpisodes\",\"numberOfFilms\",\"occupation_label\",\"occupation\",\"office\",\"opponent_label\",\"opponent\",\"orcidId\",\"orderInOffice\",\"origin_label\",\"origin\",\"otherParty_label\",\"otherParty\",\"otherWins\",\"overallRecord\",\"parent_label\",\"parent\",\"partner_label\",\"partner\",\"party_label\",\"party\",\"personFunction_label\",\"personFunction\",\"philosophicalSchool_label\",\"philosophicalSchool\",\"picture_label\",\"picture\",\"placeOfBurial_label\",\"placeOfBurial\",\"plays\",\"podiums\",\"poles\",\"portfolio\",\"portrayer_label\",\"portrayer\",\"position_label\",\"position\",\"predecessor_label\",\"predecessor\",\"presenter_label\",\"presenter\",\"president_label\",\"president\",\"previousWork_label\",\"previousWork\",\"priceMoney\",\"primeMinister_label\",\"primeMinister\",\"producer_label\",\"producer\",\"productionEndYear\",\"productionStartYear\",\"profession_label\",\"profession\",\"prospectLeague_label\",\"prospectLeague\",\"prospectTeam_label\",\"prospectTeam\",\"pseudonym\",\"publisher_label\",\"publisher\",\"race_label\",\"race\",\"raceHorse_label\",\"raceHorse\",\"races\",\"rank
ingWins\",\"rating\",\"recordDate\",\"recordLabel_label\",\"recordLabel\",\"recordedIn_label\",\"recordedIn\",\"region_label\",\"region\",\"relation_label\",\"relation\",\"relative_label\",\"relative\",\"releaseDate\",\"religion_label\",\"religion\",\"residence_label\",\"residence\",\"restingPlace_label\",\"restingPlace\",\"restingPlacePosition\",\"ridId\",\"royalAnthem_label\",\"royalAnthem\",\"runningMate_label\",\"runningMate\",\"runtime\",\"salary\",\"school_label\",\"school\",\"selection_label\",\"selection\",\"selibrId\",\"seniority\",\"series_label\",\"series\",\"serviceEndYear\",\"serviceNumber\",\"serviceStartYear\",\"shoeNumber\",\"shoots\",\"significantBuilding_label\",\"significantBuilding\",\"significantDesign_label\",\"significantDesign\",\"significantProject_label\",\"significantProject\",\"speaker\",\"species_label\",\"species\",\"spike\",\"sport_label\",\"sport\",\"sportCountry_label\",\"sportCountry\",\"spouse_label\",\"spouse\",\"squadNumber\",\"starring_label\",\"starring\",\"startYearOfInsertion\",\"stateDelegate_label\",\"stateDelegate\",\"stateOfOrigin_label\",\"stateOfOrigin\",\"statisticLabel_label\",\"statisticLabel\",\"statisticValue\",\"statisticYear\",\"status\",\"subsequentWork_label\",\"subsequentWork\",\"successor_label\",\"successor\",\"supplementalDraftRound\",\"supplementalDraftYear\",\"suppreddedDate\",\"team_label\",\"team\",\"televisionSeries_label\",\"televisionSeries\",\"termPeriod_label\",\"termPeriod\",\"throwingSide\",\"thumbnail_label\",\"thumbnail\",\"timeInSpace\",\"title\",\"tournamentRecord\",\"trackNumber\",\"trainer_label\",\"trainer\",\"training_label\",\"training\",\"transmission\",\"type_label\",\"type\",\"ulanId\",\"undraftedYear\",\"unitCost\",\"university_label\",\"university\",\"veneratedIn_label\",\"veneratedIn\",\"viafId\",\"vicePresident_label\",\"vicePresident\",\"vicePrimeMinister_label\",\"vicePrimeMinister\",\"voice_label\",\"voice\",\"voiceType_label\",\"voiceType\",\"waistSize\",\"weight\",\"whaDraft\
",\"whaDraftTeam_label\",\"whaDraftTeam\",\"wheelbase\",\"width\",\"wikiPageDisambiguates_label\",\"wikiPageDisambiguates\",\"wikiPageID\",\"wikiPageRedirects_label\",\"wikiPageRedirects\",\"wikiPageRevisionID\",\"wins\",\"winsAtAsia_label\",\"winsAtAsia\",\"winsAtAus_label\",\"winsAtAus\",\"winsAtChallenges_label\",\"winsAtChallenges\",\"winsAtChampionships_label\",\"winsAtChampionships\",\"winsAtJapan_label\",\"winsAtJapan\",\"winsAtLET_label\",\"winsAtLET\",\"winsAtLPGA_label\",\"winsAtLPGA\",\"winsAtMajors_label\",\"winsAtMajors\",\"winsAtNWIDE_label\",\"winsAtNWIDE\",\"winsAtOtherTournaments_label\",\"winsAtOtherTournaments\",\"winsAtPGA_label\",\"winsAtPGA\",\"winsAtProTournaments_label\",\"winsAtProTournaments\",\"winsInEurope_label\",\"winsInEurope\",\"worldChampionTitleYear\",\"writer_label\",\"writer\",\"year\",\"years\",\"description\",\"point\",\"22-rdf-syntax-ns#type_label\",\"22-rdf-syntax-ns#type\",\"owl#sameAs_label\",\"owl#sameAs\",\"wgs84_pos#lat\",\"wgs84_pos#long\",\"core#broader_label\",\"core#broader\",\"core#prefLabel\",\"core#related_label\",\"core#related\",\"core#subject_label\",\"core#subject\"\r\n", 
"\"URI\",\"http://www.w3.org/2000/01/rdf-schema#label\",\"http://www.w3.org/2000/01/rdf-schema#comment\",\"http://dbpedia.org/ontology/Astronaut/timeInSpace\",\"http://dbpedia.org/ontology/Automobile/wheelbase\",\"http://dbpedia.org/ontology/MeanOfTransportation/length\",\"http://dbpedia.org/ontology/MeanOfTransportation/width\",\"http://dbpedia.org/ontology/Person/height\",\"http://dbpedia.org/ontology/Person/weight\",\"http://dbpedia.org/ontology/Work/runtime\",\"http://dbpedia.org/ontology/academicAdvisor\",\"http://dbpedia.org/ontology/academicAdvisor\",\"http://dbpedia.org/ontology/activeYearsEndDate\",\"http://dbpedia.org/ontology/activeYearsEndYear\",\"http://dbpedia.org/ontology/activeYearsStartDate\",\"http://dbpedia.org/ontology/activeYearsStartYear\",\"http://dbpedia.org/ontology/album\",\"http://dbpedia.org/ontology/album\",\"http://dbpedia.org/ontology/alias\",\"http://dbpedia.org/ontology/allegiance\",\"http://dbpedia.org/ontology/almaMater\",\"http://dbpedia.org/ontology/almaMater\",\"http://dbpedia.org/ontology/appointer\",\"http://dbpedia.org/ontology/appointer\",\"http://dbpedia.org/ontology/artist\",\"http://dbpedia.org/ontology/artist\",\"http://dbpedia.org/ontology/associate\",\"http://dbpedia.org/ontology/associate\",\"http://dbpedia.org/ontology/associatedAct\",\"http://dbpedia.org/ontology/associatedAct\",\"http://dbpedia.org/ontology/associatedBand\",\"http://dbpedia.org/ontology/associatedBand\",\"http://dbpedia.org/ontology/associatedMusicalArtist\",\"http://dbpedia.org/ontology/associatedMusicalArtist\",\"http://dbpedia.org/ontology/author\",\"http://dbpedia.org/ontology/author\",\"http://dbpedia.org/ontology/automobileModel\",\"http://dbpedia.org/ontology/award\",\"http://dbpedia.org/ontology/award\",\"http://dbpedia.org/ontology/background\",\"http://dbpedia.org/ontology/battingSide\",\"http://dbpedia.org/ontology/battle\",\"http://dbpedia.org/ontology/battle\",\"http://dbpedia.org/ontology/beatifiedBy\",\"http://dbpedia.org/ontology/be
atifiedBy\",\"http://dbpedia.org/ontology/beatifiedDate\",\"http://dbpedia.org/ontology/beatifiedPlace\",\"http://dbpedia.org/ontology/beatifiedPlace\",\"http://dbpedia.org/ontology/bestFinish\",\"http://dbpedia.org/ontology/bibsysId\",\"http://dbpedia.org/ontology/billed\",\"http://dbpedia.org/ontology/billed\",\"http://dbpedia.org/ontology/birthDate\",\"http://dbpedia.org/ontology/birthName\",\"http://dbpedia.org/ontology/birthPlace\",\"http://dbpedia.org/ontology/birthPlace\",\"http://dbpedia.org/ontology/birthYear\",\"http://dbpedia.org/ontology/block\",\"http://dbpedia.org/ontology/bnfId\",\"http://dbpedia.org/ontology/board\",\"http://dbpedia.org/ontology/board\",\"http://dbpedia.org/ontology/bodyDiscovered\",\"http://dbpedia.org/ontology/bodyDiscovered\",\"http://dbpedia.org/ontology/bowlRecord\",\"http://dbpedia.org/ontology/bpnId\",\"http://dbpedia.org/ontology/buriedPlace\",\"http://dbpedia.org/ontology/buriedPlace\",\"http://dbpedia.org/ontology/bustSize\",\"http://dbpedia.org/ontology/callSign\",\"http://dbpedia.org/ontology/canonizedBy\",\"http://dbpedia.org/ontology/canonizedBy\",\"http://dbpedia.org/ontology/canonizedDate\",\"http://dbpedia.org/ontology/canonizedPlace\",\"http://dbpedia.org/ontology/canonizedPlace\",\"http://dbpedia.org/ontology/careerPoints\",\"http://dbpedia.org/ontology/careerPrizeMoney\",\"http://dbpedia.org/ontology/careerStation\",\"http://dbpedia.org/ontology/careerStation\",\"http://dbpedia.org/ontology/centuryBreaks\",\"http://dbpedia.org/ontology/chairman\",\"http://dbpedia.org/ontology/chairman\",\"http://dbpedia.org/ontology/championships\",\"http://dbpedia.org/ontology/chancellor\",\"http://dbpedia.org/ontology/chancellor\",\"http://dbpedia.org/ontology/child\",\"http://dbpedia.org/ontology/child\",\"http://dbpedia.org/ontology/choreographer\",\"http://dbpedia.org/ontology/choreographer\",\"http://dbpedia.org/ontology/citizenship\",\"http://dbpedia.org/ontology/citizenship\",\"http://dbpedia.org/ontology/club\",\"http://d
bpedia.org/ontology/club\",\"http://dbpedia.org/ontology/coach\",\"http://dbpedia.org/ontology/coach\",\"http://dbpedia.org/ontology/coachedTeam\",\"http://dbpedia.org/ontology/coachedTeam\",\"http://dbpedia.org/ontology/coachingRecord\",\"http://dbpedia.org/ontology/college\",\"http://dbpedia.org/ontology/college\",\"http://dbpedia.org/ontology/committee\",\"http://dbpedia.org/ontology/commonName\",\"http://dbpedia.org/ontology/configuration\",\"http://dbpedia.org/ontology/country\",\"http://dbpedia.org/ontology/country\",\"http://dbpedia.org/ontology/creator\",\"http://dbpedia.org/ontology/creator\",\"http://dbpedia.org/ontology/currentMember\",\"http://dbpedia.org/ontology/currentMember\",\"http://dbpedia.org/ontology/currentPartner\",\"http://dbpedia.org/ontology/currentPartner\",\"http://dbpedia.org/ontology/currentRank\",\"http://dbpedia.org/ontology/currentRecord\",\"http://dbpedia.org/ontology/dateOfBurial\",\"http://dbpedia.org/ontology/deathCause\",\"http://dbpedia.org/ontology/deathCause\",\"http://dbpedia.org/ontology/deathDate\",\"http://dbpedia.org/ontology/deathPlace\",\"http://dbpedia.org/ontology/deathPlace\",\"http://dbpedia.org/ontology/deathYear\",\"http://dbpedia.org/ontology/debut\",\"http://dbpedia.org/ontology/debutTeam\",\"http://dbpedia.org/ontology/debutTeam\",\"http://dbpedia.org/ontology/depictionDescription\",\"http://dbpedia.org/ontology/deputy\",\"http://dbpedia.org/ontology/deputy\",\"http://dbpedia.org/ontology/discipline\",\"http://dbpedia.org/ontology/discipline\",\"http://dbpedia.org/ontology/doctoralAdvisor\",\"http://dbpedia.org/ontology/doctoralAdvisor\",\"http://dbpedia.org/ontology/doctoralStudent\",\"http://dbpedia.org/ontology/doctoralStudent\",\"http://dbpedia.org/ontology/draft\",\"http://dbpedia.org/ontology/draftPick\",\"http://dbpedia.org/ontology/draftRound\",\"http://dbpedia.org/ontology/draftTeam\",\"http://dbpedia.org/ontology/draftTeam\",\"http://dbpedia.org/ontology/draftYear\",\"http://dbpedia.org/ontology/educ
ation\",\"http://dbpedia.org/ontology/education\",\"http://dbpedia.org/ontology/electionDate\",\"http://dbpedia.org/ontology/electionMajority\",\"http://dbpedia.org/ontology/employer\",\"http://dbpedia.org/ontology/employer\",\"http://dbpedia.org/ontology/endYearOfInsertion\",\"http://dbpedia.org/ontology/engine\",\"http://dbpedia.org/ontology/engine\",\"http://dbpedia.org/ontology/era\",\"http://dbpedia.org/ontology/era\",\"http://dbpedia.org/ontology/espnId\",\"http://dbpedia.org/ontology/ethnicity\",\"http://dbpedia.org/ontology/ethnicity\",\"http://dbpedia.org/ontology/event\",\"http://dbpedia.org/ontology/event\",\"http://dbpedia.org/ontology/eyeColor\",\"http://dbpedia.org/ontology/fastestLap\",\"http://dbpedia.org/ontology/feastDay\",\"http://dbpedia.org/ontology/field\",\"http://dbpedia.org/ontology/field\",\"http://dbpedia.org/ontology/firstAppearance\",\"http://dbpedia.org/ontology/firstRace\",\"http://dbpedia.org/ontology/firstRace\",\"http://dbpedia.org/ontology/firstWin\",\"http://dbpedia.org/ontology/firstWin\",\"http://dbpedia.org/ontology/format\",\"http://dbpedia.org/ontology/format\",\"http://dbpedia.org/ontology/formerChoreographer\",\"http://dbpedia.org/ontology/formerChoreographer\",\"http://dbpedia.org/ontology/formerCoach\",\"http://dbpedia.org/ontology/formerCoach\",\"http://dbpedia.org/ontology/formerHighschool\",\"http://dbpedia.org/ontology/formerHighschool\",\"http://dbpedia.org/ontology/formerPartner\",\"http://dbpedia.org/ontology/formerPartner\",\"http://dbpedia.org/ontology/formerTeam\",\"http://dbpedia.org/ontology/formerTeam\",\"http://dbpedia.org/ontology/garrison\",\"http://dbpedia.org/ontology/garrison\",\"http://dbpedia.org/ontology/gender\",\"http://dbpedia.org/ontology/gender\",\"http://dbpedia.org/ontology/genre\",\"http://dbpedia.org/ontology/genre\",\"http://dbpedia.org/ontology/governor\",\"http://dbpedia.org/ontology/governor\",\"http://dbpedia.org/ontology/governorGeneral\",\"http://dbpedia.org/ontology/governorGeneral\"
,\"http://dbpedia.org/ontology/hairColor\",\"http://dbpedia.org/ontology/hallOfFame\",\"http://dbpedia.org/ontology/height\",\"http://dbpedia.org/ontology/heir\",\"http://dbpedia.org/ontology/heir\",\"http://dbpedia.org/ontology/highestBreak\",\"http://dbpedia.org/ontology/highestRank\",\"http://dbpedia.org/ontology/highschool\",\"http://dbpedia.org/ontology/highschool\",\"http://dbpedia.org/ontology/hipSize\",\"http://dbpedia.org/ontology/hometown\",\"http://dbpedia.org/ontology/hometown\",\"http://dbpedia.org/ontology/honours\",\"http://dbpedia.org/ontology/honours\",\"http://dbpedia.org/ontology/incumbent\",\"http://dbpedia.org/ontology/incumbent\",\"http://dbpedia.org/ontology/individualisedGnd\",\"http://dbpedia.org/ontology/influenced\",\"http://dbpedia.org/ontology/influenced\",\"http://dbpedia.org/ontology/influencedBy\",\"http://dbpedia.org/ontology/influencedBy\",\"http://dbpedia.org/ontology/institution\",\"http://dbpedia.org/ontology/institution\",\"http://dbpedia.org/ontology/instrument\",\"http://dbpedia.org/ontology/instrument\",\"http://dbpedia.org/ontology/isniId\",\"http://dbpedia.org/ontology/knownFor\",\"http://dbpedia.org/ontology/knownFor\",\"http://dbpedia.org/ontology/language\",\"http://dbpedia.org/ontology/language\",\"http://dbpedia.org/ontology/lastAppearance\",\"http://dbpedia.org/ontology/lastAppearance\",\"http://dbpedia.org/ontology/lastPosition\",\"http://dbpedia.org/ontology/lastRace\",\"http://dbpedia.org/ontology/lastRace\",\"http://dbpedia.org/ontology/lastWin\",\"http://dbpedia.org/ontology/lastWin\",\"http://dbpedia.org/ontology/lccnId\",\"http://dbpedia.org/ontology/league\",\"http://dbpedia.org/ontology/league\",\"http://dbpedia.org/ontology/length\",\"http://dbpedia.org/ontology/lieutenant\",\"http://dbpedia.org/ontology/lieutenant\",\"http://dbpedia.org/ontology/location\",\"http://dbpedia.org/ontology/location\",\"http://dbpedia.org/ontology/mainInterest\",\"http://dbpedia.org/ontology/mainInterest\",\"http://dbpedia.org/o
ntology/majorShrine\",\"http://dbpedia.org/ontology/majorShrine\",\"http://dbpedia.org/ontology/majorityLeader\",\"http://dbpedia.org/ontology/manager\",\"http://dbpedia.org/ontology/manager\",\"http://dbpedia.org/ontology/managerClub\",\"http://dbpedia.org/ontology/managerClub\",\"http://dbpedia.org/ontology/manufacturer\",\"http://dbpedia.org/ontology/manufacturer\",\"http://dbpedia.org/ontology/mbaId\",\"http://dbpedia.org/ontology/militaryBranch\",\"http://dbpedia.org/ontology/militaryBranch\",\"http://dbpedia.org/ontology/militaryCommand\",\"http://dbpedia.org/ontology/militaryRank\",\"http://dbpedia.org/ontology/militaryRank\",\"http://dbpedia.org/ontology/militaryUnit\",\"http://dbpedia.org/ontology/militaryUnit\",\"http://dbpedia.org/ontology/mission\",\"http://dbpedia.org/ontology/mission\",\"http://dbpedia.org/ontology/monarch\",\"http://dbpedia.org/ontology/monarch\",\"http://dbpedia.org/ontology/movement\",\"http://dbpedia.org/ontology/movement\",\"http://dbpedia.org/ontology/mythology\",\"http://dbpedia.org/ontology/mythology\",\"http://dbpedia.org/ontology/nationalTeam\",\"http://dbpedia.org/ontology/nationalTeam\",\"http://dbpedia.org/ontology/nationality\",\"http://dbpedia.org/ontology/nationality\",\"http://dbpedia.org/ontology/network\",\"http://dbpedia.org/ontology/network\",\"http://dbpedia.org/ontology/networth\",\"http://dbpedia.org/ontology/nlaId\",\"http://dbpedia.org/ontology/nominee\",\"http://dbpedia.org/ontology/nominee\",\"http://dbpedia.org/ontology/notableIdea\",\"http://dbpedia.org/ontology/notableIdea\",\"http://dbpedia.org/ontology/notableStudent\",\"http://dbpedia.org/ontology/notableStudent\",\"http://dbpedia.org/ontology/notableWork\",\"http://dbpedia.org/ontology/notableWork\",\"http://dbpedia.org/ontology/number\",\"http://dbpedia.org/ontology/numberBuilt\",\"http://dbpedia.org/ontology/numberOfEpisodes\",\"http://dbpedia.org/ontology/numberOfFilms\",\"http://dbpedia.org/ontology/occupation\",\"http://dbpedia.org/ontology/occup
ation\",\"http://dbpedia.org/ontology/office\",\"http://dbpedia.org/ontology/opponent\",\"http://dbpedia.org/ontology/opponent\",\"http://dbpedia.org/ontology/orcidId\",\"http://dbpedia.org/ontology/orderInOffice\",\"http://dbpedia.org/ontology/origin\",\"http://dbpedia.org/ontology/origin\",\"http://dbpedia.org/ontology/otherParty\",\"http://dbpedia.org/ontology/otherParty\",\"http://dbpedia.org/ontology/otherWins\",\"http://dbpedia.org/ontology/overallRecord\",\"http://dbpedia.org/ontology/parent\",\"http://dbpedia.org/ontology/parent\",\"http://dbpedia.org/ontology/partner\",\"http://dbpedia.org/ontology/partner\",\"http://dbpedia.org/ontology/party\",\"http://dbpedia.org/ontology/party\",\"http://dbpedia.org/ontology/personFunction\",\"http://dbpedia.org/ontology/personFunction\",\"http://dbpedia.org/ontology/philosophicalSchool\",\"http://dbpedia.org/ontology/philosophicalSchool\",\"http://dbpedia.org/ontology/picture\",\"http://dbpedia.org/ontology/picture\",\"http://dbpedia.org/ontology/placeOfBurial\",\"http://dbpedia.org/ontology/placeOfBurial\",\"http://dbpedia.org/ontology/plays\",\"http://dbpedia.org/ontology/podiums\",\"http://dbpedia.org/ontology/poles\",\"http://dbpedia.org/ontology/portfolio\",\"http://dbpedia.org/ontology/portrayer\",\"http://dbpedia.org/ontology/portrayer\",\"http://dbpedia.org/ontology/position\",\"http://dbpedia.org/ontology/position\",\"http://dbpedia.org/ontology/predecessor\",\"http://dbpedia.org/ontology/predecessor\",\"http://dbpedia.org/ontology/presenter\",\"http://dbpedia.org/ontology/presenter\",\"http://dbpedia.org/ontology/president\",\"http://dbpedia.org/ontology/president\",\"http://dbpedia.org/ontology/previousWork\",\"http://dbpedia.org/ontology/previousWork\",\"http://dbpedia.org/ontology/priceMoney\",\"http://dbpedia.org/ontology/primeMinister\",\"http://dbpedia.org/ontology/primeMinister\",\"http://dbpedia.org/ontology/producer\",\"http://dbpedia.org/ontology/producer\",\"http://dbpedia.org/ontology/productionEn
dYear\",\"http://dbpedia.org/ontology/productionStartYear\",\"http://dbpedia.org/ontology/profession\",\"http://dbpedia.org/ontology/profession\",\"http://dbpedia.org/ontology/prospectLeague\",\"http://dbpedia.org/ontology/prospectLeague\",\"http://dbpedia.org/ontology/prospectTeam\",\"http://dbpedia.org/ontology/prospectTeam\",\"http://dbpedia.org/ontology/pseudonym\",\"http://dbpedia.org/ontology/publisher\",\"http://dbpedia.org/ontology/publisher\",\"http://dbpedia.org/ontology/race\",\"http://dbpedia.org/ontology/race\",\"http://dbpedia.org/ontology/raceHorse\",\"http://dbpedia.org/ontology/raceHorse\",\"http://dbpedia.org/ontology/races\",\"http://dbpedia.org/ontology/rankingWins\",\"http://dbpedia.org/ontology/rating\",\"http://dbpedia.org/ontology/recordDate\",\"http://dbpedia.org/ontology/recordLabel\",\"http://dbpedia.org/ontology/recordLabel\",\"http://dbpedia.org/ontology/recordedIn\",\"http://dbpedia.org/ontology/recordedIn\",\"http://dbpedia.org/ontology/region\",\"http://dbpedia.org/ontology/region\",\"http://dbpedia.org/ontology/relation\",\"http://dbpedia.org/ontology/relation\",\"http://dbpedia.org/ontology/relative\",\"http://dbpedia.org/ontology/relative\",\"http://dbpedia.org/ontology/releaseDate\",\"http://dbpedia.org/ontology/religion\",\"http://dbpedia.org/ontology/religion\",\"http://dbpedia.org/ontology/residence\",\"http://dbpedia.org/ontology/residence\",\"http://dbpedia.org/ontology/restingPlace\",\"http://dbpedia.org/ontology/restingPlace\",\"http://dbpedia.org/ontology/restingPlacePosition\",\"http://dbpedia.org/ontology/ridId\",\"http://dbpedia.org/ontology/royalAnthem\",\"http://dbpedia.org/ontology/royalAnthem\",\"http://dbpedia.org/ontology/runningMate\",\"http://dbpedia.org/ontology/runningMate\",\"http://dbpedia.org/ontology/runtime\",\"http://dbpedia.org/ontology/salary\",\"http://dbpedia.org/ontology/school\",\"http://dbpedia.org/ontology/school\",\"http://dbpedia.org/ontology/selection\",\"http://dbpedia.org/ontology/selection\
",\"http://dbpedia.org/ontology/selibrId\",\"http://dbpedia.org/ontology/seniority\",\"http://dbpedia.org/ontology/series\",\"http://dbpedia.org/ontology/series\",\"http://dbpedia.org/ontology/serviceEndYear\",\"http://dbpedia.org/ontology/serviceNumber\",\"http://dbpedia.org/ontology/serviceStartYear\",\"http://dbpedia.org/ontology/shoeNumber\",\"http://dbpedia.org/ontology/shoots\",\"http://dbpedia.org/ontology/significantBuilding\",\"http://dbpedia.org/ontology/significantBuilding\",\"http://dbpedia.org/ontology/significantDesign\",\"http://dbpedia.org/ontology/significantDesign\",\"http://dbpedia.org/ontology/significantProject\",\"http://dbpedia.org/ontology/significantProject\",\"http://dbpedia.org/ontology/speaker\",\"http://dbpedia.org/ontology/species\",\"http://dbpedia.org/ontology/species\",\"http://dbpedia.org/ontology/spike\",\"http://dbpedia.org/ontology/sport\",\"http://dbpedia.org/ontology/sport\",\"http://dbpedia.org/ontology/sportCountry\",\"http://dbpedia.org/ontology/sportCountry\",\"http://dbpedia.org/ontology/spouse\",\"http://dbpedia.org/ontology/spouse\",\"http://dbpedia.org/ontology/squadNumber\",\"http://dbpedia.org/ontology/starring\",\"http://dbpedia.org/ontology/starring\",\"http://dbpedia.org/ontology/startYearOfInsertion\",\"http://dbpedia.org/ontology/stateDelegate\",\"http://dbpedia.org/ontology/stateDelegate\",\"http://dbpedia.org/ontology/stateOfOrigin\",\"http://dbpedia.org/ontology/stateOfOrigin\",\"http://dbpedia.org/ontology/statisticLabel\",\"http://dbpedia.org/ontology/statisticLabel\",\"http://dbpedia.org/ontology/statisticValue\",\"http://dbpedia.org/ontology/statisticYear\",\"http://dbpedia.org/ontology/status\",\"http://dbpedia.org/ontology/subsequentWork\",\"http://dbpedia.org/ontology/subsequentWork\",\"http://dbpedia.org/ontology/successor\",\"http://dbpedia.org/ontology/successor\",\"http://dbpedia.org/ontology/supplementalDraftRound\",\"http://dbpedia.org/ontology/supplementalDraftYear\",\"http://dbpedia.org/ontology
/suppreddedDate\",\"http://dbpedia.org/ontology/team\",\"http://dbpedia.org/ontology/team\",\"http://dbpedia.org/ontology/televisionSeries\",\"http://dbpedia.org/ontology/televisionSeries\",\"http://dbpedia.org/ontology/termPeriod\",\"http://dbpedia.org/ontology/termPeriod\",\"http://dbpedia.org/ontology/throwingSide\",\"http://dbpedia.org/ontology/thumbnail\",\"http://dbpedia.org/ontology/thumbnail\",\"http://dbpedia.org/ontology/timeInSpace\",\"http://dbpedia.org/ontology/title\",\"http://dbpedia.org/ontology/tournamentRecord\",\"http://dbpedia.org/ontology/trackNumber\",\"http://dbpedia.org/ontology/trainer\",\"http://dbpedia.org/ontology/trainer\",\"http://dbpedia.org/ontology/training\",\"http://dbpedia.org/ontology/training\",\"http://dbpedia.org/ontology/transmission\",\"http://dbpedia.org/ontology/type\",\"http://dbpedia.org/ontology/type\",\"http://dbpedia.org/ontology/ulanId\",\"http://dbpedia.org/ontology/undraftedYear\",\"http://dbpedia.org/ontology/unitCost\",\"http://dbpedia.org/ontology/university\",\"http://dbpedia.org/ontology/university\",\"http://dbpedia.org/ontology/veneratedIn\",\"http://dbpedia.org/ontology/veneratedIn\",\"http://dbpedia.org/ontology/viafId\",\"http://dbpedia.org/ontology/vicePresident\",\"http://dbpedia.org/ontology/vicePresident\",\"http://dbpedia.org/ontology/vicePrimeMinister\",\"http://dbpedia.org/ontology/vicePrimeMinister\",\"http://dbpedia.org/ontology/voice\",\"http://dbpedia.org/ontology/voice\",\"http://dbpedia.org/ontology/voiceType\",\"http://dbpedia.org/ontology/voiceType\",\"http://dbpedia.org/ontology/waistSize\",\"http://dbpedia.org/ontology/weight\",\"http://dbpedia.org/ontology/whaDraft\",\"http://dbpedia.org/ontology/whaDraftTeam\",\"http://dbpedia.org/ontology/whaDraftTeam\",\"http://dbpedia.org/ontology/wheelbase\",\"http://dbpedia.org/ontology/width\",\"http://dbpedia.org/ontology/wikiPageDisambiguates\",\"http://dbpedia.org/ontology/wikiPageDisambiguates\",\"http://dbpedia.org/ontology/wikiPageID\",\"htt
p://dbpedia.org/ontology/wikiPageRedirects\",\"http://dbpedia.org/ontology/wikiPageRedirects\",\"http://dbpedia.org/ontology/wikiPageRevisionID\",\"http://dbpedia.org/ontology/wins\",\"http://dbpedia.org/ontology/winsAtAsia\",\"http://dbpedia.org/ontology/winsAtAsia\",\"http://dbpedia.org/ontology/winsAtAus\",\"http://dbpedia.org/ontology/winsAtAus\",\"http://dbpedia.org/ontology/winsAtChallenges\",\"http://dbpedia.org/ontology/winsAtChallenges\",\"http://dbpedia.org/ontology/winsAtChampionships\",\"http://dbpedia.org/ontology/winsAtChampionships\",\"http://dbpedia.org/ontology/winsAtJapan\",\"http://dbpedia.org/ontology/winsAtJapan\",\"http://dbpedia.org/ontology/winsAtLET\",\"http://dbpedia.org/ontology/winsAtLET\",\"http://dbpedia.org/ontology/winsAtLPGA\",\"http://dbpedia.org/ontology/winsAtLPGA\",\"http://dbpedia.org/ontology/winsAtMajors\",\"http://dbpedia.org/ontology/winsAtMajors\",\"http://dbpedia.org/ontology/winsAtNWIDE\",\"http://dbpedia.org/ontology/winsAtNWIDE\",\"http://dbpedia.org/ontology/winsAtOtherTournaments\",\"http://dbpedia.org/ontology/winsAtOtherTournaments\",\"http://dbpedia.org/ontology/winsAtPGA\",\"http://dbpedia.org/ontology/winsAtPGA\",\"http://dbpedia.org/ontology/winsAtProTournaments\",\"http://dbpedia.org/ontology/winsAtProTournaments\",\"http://dbpedia.org/ontology/winsInEurope\",\"http://dbpedia.org/ontology/winsInEurope\",\"http://dbpedia.org/ontology/worldChampionTitleYear\",\"http://dbpedia.org/ontology/writer\",\"http://dbpedia.org/ontology/writer\",\"http://dbpedia.org/ontology/year\",\"http://dbpedia.org/ontology/years\",\"http://purl.org/dc/elements/1.1/description\",\"http://www.georss.org/georss/point\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#type\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#type\",\"http://www.w3.org/2002/07/owl#sameAs\",\"http://www.w3.org/2002/07/owl#sameAs\",\"http://www.w3.org/2003/01/geo/wgs84_pos#lat\",\"http://www.w3.org/2003/01/geo/wgs84_pos#long\",\"http://www.w3.org/2004/02/skos/core#bro
ader\",\"http://www.w3.org/2004/02/skos/core#broader\",\"http://www.w3.org/2004/02/skos/core#prefLabel\",\"http://www.w3.org/2004/02/skos/core#related\",\"http://www.w3.org/2004/02/skos/core#related\",\"http://www.w3.org/2004/02/skos/core#subject\",\"http://www.w3.org/2004/02/skos/core#subject\"\r\n", "\"URI\",\"rdf-schema#Literal\",\"rdf-schema#Literal\",\"minute\",\"millimetre\",\"millimetre\",\"millimetre\",\"centimetre\",\"kilogram\",\"minute\",\"XMLSchema#string\",\"Person\",\"XMLSchema#date\",\"XMLSchema#gYear\",\"XMLSchema#date\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"Album\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Agent\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Artist\",\"XMLSchema#string\",\"Band\",\"XMLSchema#string\",\"MusicalArtist\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"XMLSchema#string\",\"Award\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"MilitaryConflict\",\"XMLSchema#string\",\"Person\",\"XMLSchema#date\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#date\",\"22-rdf-syntax-ns#langString\",\"XMLSchema#string\",\"Place\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"Place\",\"XMLSchema#double\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#date\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#integer\",\"XMLSchema#double\",\"XMLSchema#string\",\"CareerStation\",\"XMLSchema#integer\",\"XMLSchema#string\",\"Person\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"
Person\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"22-rdf-syntax-ns#langString\",\"engineConfiguration\",\"XMLSchema#string\",\"Country\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#integer\",\"XMLSchema#string\",\"XMLSchema#date\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#date\",\"XMLSchema#string\",\"Place\",\"XMLSchema#gYear\",\"XMLSchema#date\",\"XMLSchema#string\",\"SportsTeam\",\"22-rdf-syntax-ns#langString\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#date\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"Organisation\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"AutomobileEngine\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#integer\",\"XMLSchema#string\",\"EthnicGroup\",\"XMLSchema#string\",\"Event\",\"XMLSchema#string\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#date\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"GrandPrix\",\"XMLSchema#string\",\"GrandPrix\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Genre\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#double\",\"XMLSchema#string\",\"Person\",\"XMLSchema#integer\",\"XMLSchema#integer\",\"XMLSchema#string\",\"School\",\"XMLSchema#double\",\"XMLSchema#string\",\"Settlement\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#strin
g\",\"Person\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Organisation\",\"XMLSchema#string\",\"Instrument\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Language\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#positiveInteger\",\"XMLSchema#string\",\"GrandPrix\",\"XMLSchema#string\",\"GrandPrix\",\"XMLSchema#string\",\"XMLSchema#string\",\"SportsLeague\",\"XMLSchema#double\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Place\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#integer\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"Organisation\",\"XMLSchema#string\",\"XMLSchema#string\",\"MilitaryUnit\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"MilitaryUnit\",\"XMLSchema#string\",\"SpaceMission\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"Country\",\"XMLSchema#string\",\"Broadcaster\",\"XMLSchema#double\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Work\",\"XMLSchema#integer\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"PersonFunction\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#string\",\"PoliticalParty\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"PoliticalParty\",\"XMLSchema#string\",\"PersonFunction\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#string\",\"XMLSchema#nonNegativeIn
teger\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Work\",\"XMLSchema#double\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Agent\",\"XMLSchema#gYear\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"SportsLeague\",\"XMLSchema#string\",\"HockeyTeam\",\"22-rdf-syntax-ns#langString\",\"XMLSchema#string\",\"Company\",\"XMLSchema#string\",\"Race\",\"XMLSchema#string\",\"RaceHorse\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#float\",\"XMLSchema#date\",\"XMLSchema#string\",\"RecordLabel\",\"XMLSchema#string\",\"PopulatedPlace\",\"XMLSchema#string\",\"Place\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#date\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Place\",\"XMLSchema#string\",\"Place\",\"wgs84_pos#SpatialThing\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"Person\",\"XMLSchema#double\",\"XMLSchema#double\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"XMLSchema#gYear\",\"XMLSchema#positiveInteger\",\"XMLSchema#string\",\"XMLSchema#string\",\"Building\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#integer\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"Sport\",\"XMLSchema#string\",\"Country\",\"XMLSchema#string\",\"Person\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"Actor\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Country\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#float\",\"XMLSchema#date\",\"XMLSchema#string\",\"XMLSchema#string\",\"Work\",\"
XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#gYear\",\"XMLSchema#date\",\"XMLSchema#string\",\"SportsTeam\",\"XMLSchema#string\",\"TelevisionShow\",\"XMLSchema#string\",\"TimePeriod\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#double\",\"22-rdf-syntax-ns#langString\",\"XMLSchema#string\",\"XMLSchema#positiveInteger\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#gYear\",\"XMLSchema#double\",\"XMLSchema#string\",\"EducationalInstitution\",\"XMLSchema#string\",\"Organisation\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"Person\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#double\",\"XMLSchema#double\",\"XMLSchema#string\",\"XMLSchema#string\",\"HockeyTeam\",\"XMLSchema#double\",\"XMLSchema#double\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#integer\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#integer\",\"XMLSchema#nonNegativeInteger\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"Person\",\"XMLSchema#gYear\",\"XMLSchema#gYear\",\"XMLSchema#string\",\"XMLSchema#string\",\"XMLSchema#string\",\"rdf-schema#Class\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#float\",\"XMLSchema#float\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"XMLSchema#string\",\"owl#Thing\",\"XMLSchema#string\",\"owl#Thing\"\r\n", 
"\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2000/01/rdf-schema#Literal\",\"http://www.w3.org/2000/01/rdf-schema#Literal\",\"http://dbpedia.org/datatype/minute\",\"http://dbpedia.org/datatype/millimetre\",\"http://dbpedia.org/datatype/millimetre\",\"http://dbpedia.org/datatype/millimetre\",\"http://dbpedia.org/datatype/centimetre\",\"http://dbpedia.org/datatype/kilogram\",\"http://dbpedia.org/datatype/minute\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Album\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Agent\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Artist\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Band\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/MusicalArtist\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Award\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/MilitaryConflict\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSch
ema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#langString\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/CareerStation\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/200
1/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#langString\",\"http://dbpedia.org/datatype/engineConfiguration\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Country\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#langString\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3
.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Organisation\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/AutomobileEngine\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EthnicGroup\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Event\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/GrandPrix\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/GrandPrix\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Genre\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http:/
/dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/School\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Settlement\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Organisation\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Instrument\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Language\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#positiveInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/GrandPrix\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/GrandPrix\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsLeague\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchem
a#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Organisation\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/MilitaryUnit\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/MilitaryUnit\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SpaceMission\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Country\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Broadcaster\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Work\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#n
onNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PersonFunction\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PoliticalParty\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PoliticalParty\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PersonFunction\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Work\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\
",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Agent\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsLeague\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/HockeyTeam\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#langString\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Company\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Race\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/RaceHorse\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#float\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/RecordLabel\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/PopulatedPlace\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Place\",\"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#doubl
e\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#positiveInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Building\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Sport\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Country\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Actor\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Country\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#float\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Work\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://
www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#date\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/SportsTeam\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/TelevisionShow\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/TimePeriod\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/1999/02/22-rdf-syntax-ns#langString\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#positiveInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/EducationalInstitution\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Organisation\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/HockeyTeam\",\"http://www.w3.org/200
1/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#double\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#integer\",\"http://www.w3.org/2001/XMLSchema#nonNegativeInteger\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://dbpedia.org/ontology/Person\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#gYear\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2000/01/rdf-schema#Class\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#float\",\"http://w
ww.w3.org/2001/XMLSchema#float\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\",\"http://www.w3.org/2001/XMLSchema#string\",\"http://www.w3.org/2002/07/owl#Thing\"\r\n", "\"http://dbpedia.org/resource/Ingeborg_Moen_Borgerud\",\"Ingeborg Moen Borgerud\",\"Ingeborg Moen Borgerud (born 23 September 1949) is a Norwegian lawyer businessperson and former politician for the Labour Party.\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"1949-09-23\",\"NULL\",\"NULL\",\"NULL\",\"1949-01-01T00:00:00+02:00\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",
\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"
NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"68355056\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"29651641\",\"NULL\",\"NULL\",\"605026708\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"Norwegian businesswoman and 
politician\",\"NULL\",\"{agent|person|owl#Thing|Person|Q215627|Q5|DUL.owl#Agent|DUL.owl#NaturalPerson|LabourParty(Norway)Politicians|LivingPeople|Adult109605289|CausalAgent100007347|Head110162991|Lawyer110249950|Leader109623038|LivingThing100004258|Object100002684|Organism100004475|Person100007846|Politician110451263|Professional110480253|Secretary110570019|Whole100003553|YagoLegalActorGeo|PhysicalEntity100001930|NorwegianLawyers|NorwegianStateRailwaysPeople|NorwegianStateSecretaries|PeopleFromPorsgrunn|YagoLegalActor}\",\"{http://dbpedia.org/ontology/Agent|http://dbpedia.org/ontology/Person|http://www.w3.org/2002/07/owl#Thing|http://schema.org/Person|http://xmlns.com/foaf/0.1/Person|http://wikidata.dbpedia.org/resource/Q215627|http://wikidata.dbpedia.org/resource/Q5|http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent|http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#NaturalPerson|http://dbpedia.org/class/yago/LabourParty(Norway)Politicians|http://dbpedia.org/class/yago/LivingPeople|http://dbpedia.org/class/yago/Adult109605289|http://dbpedia.org/class/yago/CausalAgent100007347|http://dbpedia.org/class/yago/Head110162991|http://dbpedia.org/class/yago/Lawyer110249950|http://dbpedia.org/class/yago/Leader109623038|http://dbpedia.org/class/yago/LivingThing100004258|http://dbpedia.org/class/yago/Object100002684|http://dbpedia.org/class/yago/Organism100004475|http://dbpedia.org/class/yago/Person100007846|http://dbpedia.org/class/yago/Politician110451263|http://dbpedia.org/class/yago/Professional110480253|http://dbpedia.org/class/yago/Secretary110570019|http://dbpedia.org/class/yago/Whole100003553|http://dbpedia.org/class/yago/YagoLegalActorGeo|http://dbpedia.org/class/yago/PhysicalEntity100001930|http://dbpedia.org/class/yago/NorwegianLawyers|http://dbpedia.org/class/yago/NorwegianStateRailwaysPeople|http://dbpedia.org/class/yago/NorwegianStateSecretaries|http://dbpedia.org/class/yago/PeopleFromPorsgrunn|http://dbpedia.org/class/yago/YagoLegalActor}\",\"{m.0fpk
0fl|Q6032049|Ingeborg_Moen_Borgerud}\",\"{http://rdf.freebase.com/ns/m.0fpk0fl|http://wikidata.dbpedia.org/resource/Q6032049|http://wikidata.org/entity/Q6032049|http://yago-knowledge.org/resource/Ingeborg_Moen_Borgerud}\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\",\"NULL\"\r\n" ] } ], "source": [ "!head -5 Person_10K.csv" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.22 s, sys: 206 ms, total: 1.42 s\n", "Wall time: 2.48 s\n" ] } ], "source": [ "# Import the spreadsheet into pandas\n", "# we manually inspected the csv to set the parameters\n", "%time Persons10K= pd.read_csv('Person_10K.csv', sep=',', header=0, skiprows=[1,2,3], index_col=0, na_values='NULL', low_memory=False)\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "[Table preview of Persons10K.head(), indexed by URI: 5 rows × 513 columns]\n", "
" ], "text/plain": [ " rdf-schema#label \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud Ingeborg Moen Borgerud \n", "http://dbpedia.org/resource/Ingeborg_Nilsson Ingeborg Nilsson \n", "http://dbpedia.org/resource/Ingeborg_Norell Ingeborg Norell \n", "http://dbpedia.org/resource/Ingeborg_Pehrson Ingeborg Pehrson \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller Ingeborg Pfüller \n", "\n", " rdf-schema#comment \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud Ingeborg Moen Borgerud (born 23 September 1949... \n", "http://dbpedia.org/resource/Ingeborg_Nilsson Ingeborg Nilsson (1924 – 1995) was a Norwegian... \n", "http://dbpedia.org/resource/Ingeborg_Norell Ingeborg Norell (born 1727) was the first Finn... \n", "http://dbpedia.org/resource/Ingeborg_Pehrson Ingeborg Pehrson (16 December 1886 - 11 April ... \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller Ingeborg Ella Pfüller (born January 1 1932) wa... \n", "\n", " timeInSpace wheelbase \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN NaN \n", "\n", " length width height \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN NaN NaN \n", "\n", " weight runtime \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN NaN \n", 
"http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN NaN \n", "\n", " academicAdvisor_label \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " ... \\\n", "URI ... \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud ... \n", "http://dbpedia.org/resource/Ingeborg_Nilsson ... \n", "http://dbpedia.org/resource/Ingeborg_Norell ... \n", "http://dbpedia.org/resource/Ingeborg_Pehrson ... \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller ... \n", "\n", " owl#sameAs \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud {http://rdf.freebase.com/ns/m.0fpk0fl|http://w... \n", "http://dbpedia.org/resource/Ingeborg_Nilsson {http://rdf.freebase.com/ns/m.0gk_y5m|http://d... \n", "http://dbpedia.org/resource/Ingeborg_Norell {http://wikidata.org/entity/Q14324600|http://r... \n", "http://dbpedia.org/resource/Ingeborg_Pehrson {http://rdf.freebase.com/ns/m.04n5lbm|http://w... \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller {http://rdf.freebase.com/ns/m.04zx11p|http://y... 
\n", "\n", " wgs84_pos#lat \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " wgs84_pos#long \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#broader_label \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#broader \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#prefLabel \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#related_label \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#related \\\n", "URI \n", 
"http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#subject_label \\\n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", " core#subject \n", "URI \n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud NaN \n", "http://dbpedia.org/resource/Ingeborg_Nilsson NaN \n", "http://dbpedia.org/resource/Ingeborg_Norell NaN \n", "http://dbpedia.org/resource/Ingeborg_Pehrson NaN \n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller NaN \n", "\n", "[5 rows x 513 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# just checking\n", "Persons10K.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 3 first explorations\n", "\n", "1. What are those labels anyway?\n", "2. Which labels are used a lot?\n", "3. Average and median number of attribute-value pairs in an infobox\n", "4. Can we shrink the data?" 
] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(204, 204)" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# about 2% of the persons in the sample have a spouse;\n", "# the spouse and spouse_label columns are filled equally often\n", "len(Persons10K.spouse.dropna()), len(Persons10K.spouse_label.dropna())" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# remove the columns whose name contains \"_label\"\n", "withoutlabelcolumns = [c for c in Persons10K.columns if \"_label\" not in c]\n", "P10K = Persons10K[withoutlabelcolumns]\n", "\n", "# Count the number of non-null values in each column\n", "# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.count.html\n", "\n", "a = P10K.count(axis=0)\n", "# Series.sort() was removed from pandas; sort_values returns a sorted copy\n", "a = a.sort_values(ascending=False)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "22-rdf-syntax-ns#type 9996\n", "owl#sameAs 7523\n", "rdf-schema#label 7480\n", "wikiPageID 7480\n", "wikiPageRevisionID 7480\n", "rdf-schema#comment 7465\n", "description 6735\n", "birthYear 6343\n", "birthDate 6334\n", "birthPlace 3582\n", "deathYear 3065\n", "deathDate 3033\n", "thumbnail 2206\n", "team 1866\n", "viafId 1733\n", "deathPlace 1132\n", "height.1 953\n", "height 868\n", "occupation 856\n", "position 812\n", "dtype: int64" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.head(20)" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "URI\n", "http://dbpedia.org/resource/Isaac_Stevens 34\n", "http://dbpedia.org/resource/Isabel_la_Negra 34\n", "http://dbpedia.org/resource/Ion_Antonescu 31\n", "http://dbpedia.org/resource/Isaac_Shelby 31\n", "http://dbpedia.org/resource/Iskander_Mirza 31\n", "http://dbpedia.org/resource/Isaac_Asimov 28\n", "http://dbpedia.org/resource/Ira_Joy_Chase 
28\n", "http://dbpedia.org/resource/Isaac_R._Sherwood 28\n", "http://dbpedia.org/resource/Isaac_Newton 27\n", "http://dbpedia.org/resource/Ioannis_Metaxas 27\n", "dtype: int64" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# number of non-null attributes per person (row-wise count)\n", "a = P10K.count(axis=1)\n", "a = a.sort_values(ascending=False)\n", "a.head(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "a.plot(kind='hist');" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(10.065126050420169, 11.0, 5.72774509087427)" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.mean(), a.median(), a.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Further cleaning\n", "\n", "* Let's remove all persons without a birthdate (persons without a description could be dropped the same way).\n", "* The number of filled attributes per person then looks far more reasonable: the mean rises from about 10.1 to about 13.5." 
] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html\n", "P10K_birthdate= P10K.dropna(axis='index', subset=['birthDate'])" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "6334" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(P10K_birthdate)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(13.511209346384591, 13.0, 3.4915861542848923)" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY0AAAEACAYAAABPiSrXAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEx1JREFUeJzt3X+sZOVdx/H3h9JGaEkoUpefhlW32jUYCBaqrXKNKdkm\nBtAaWowGldQaaqlEE9mauDcmalvTpm1M8Y/SAtWiq40IllKWhrFVI5tWKNtut0DSNd0VFqv9AZro\nYr/+MWdhenvv9rm798y5M/f9SiaceeacOd/DMzufe55zzpxUFZIktThh6AIkSbPD0JAkNTM0JEnN\nDA1JUjNDQ5LUzNCQJDXrLTSSnJvk/iSfT/K5JNd37YtJDiR5sHu8ZmKZ7UkeTbIvyWUT7Rcl2dO9\n9p6+apYkHV36uk4jyRnAGVX1UJIXAZ8BrgSuAp6qqnctmX8r8GHg5cDZwH3AlqqqJLuB36iq3Unu\nBt5bVff0UrgkaUW97WlU1RNV9VA3/TTwBcZhAJBlFrkCuL2qDlfVfuAx4JIkZwKnVNXubr7bGIeP\nJGnKpnJMI8l5wIXAP3dNb07y2SQ3Jzm1azsLODCx2AHGIbO0/SDPhY8kaYp6D41uaOqvgbd0exw3\nAZuBC4DHgXf2XYMkaW2c2OebJ3k+8BHgz6rqDoCqenLi9fcDd3VPDwLnTix+DuM9jIPd9GT7wWXW\n5Y9oSdIxqKrlDhksq8+zpwLcDOytqndPtJ85MdvPAnu66TuB1yd5QZLNwBZgd1U9AXwjySXde/4S\ncMdy66yquX3s2LFj8BrcPrdvo23bRti+1epzT+OVwC8CDyd5sGt7K3B1kguAAr4EvBGgqvYm2Qns\nBZ4Brqvntug64BbgJODu8swpSRpEb6FRVf/A8nsyHzvKMn8I/OEy7Z8Bzl+76iRJx8IrwmfEwsLC\n0CX0yu2bXfO8bTD/27davV3cN21Jal62RZKmJQm1Hg6ES5Lmj6EhSWpmaEiSmhkakqRmhoYkqZmh\nIUlqZmhIkpoZGpKkZoaGJKmZoSFJamZoSJKaGRqSpGa93rlP0zW+R9Ww/NFIab4ZGnNnyC/t4UNL\nUr8cnpIkNTM0JEnNDA1JUjNDQ5LUzNCQJDUzNCRJzQwNSVIzQ0OS1MzQkCQ1MzQkSc0MDUlSM0ND\nktTM0JAkNTM0JEnNDA1JUjNDQ5LU
zNCQJDUzNCRJzQwNSVIzQ0OS1Ky30EhybpL7k3w+yeeSXN+1\nn5ZkV5JHktyb5NSJZbYneTTJviSXTbRflGRP99p7+qpZknR0fe5pHAZuqKofBl4BvCnJy4AbgV1V\n9VLgE91zkmwFXgdsBbYB70uS7r1uAq6tqi3AliTbeqxbkrSC3kKjqp6oqoe66aeBLwBnA5cDt3az\n3Qpc2U1fAdxeVYeraj/wGHBJkjOBU6pqdzffbRPLSJKmaCrHNJKcB1wIPABsqqpD3UuHgE3d9FnA\ngYnFDjAOmaXtB7t2SdKUndj3CpK8CPgI8Jaqeuq5ESeoqkpSa7WuxcXFZ6cXFhZYWFhYq7eWpLkw\nGo0YjUbHvHyq1uw7+9vfPHk+8HfAx6rq3V3bPmChqp7ohp7ur6ofSnIjQFW9rZvvHmAH8K/dPC/r\n2q8GLq2qX1+yrupzW2bBOJCH/H8QNnofSLMmCVWV7zznWJ9nTwW4Gdh7JDA6dwLXdNPXAHdMtL8+\nyQuSbAa2ALur6gngG0ku6d7zlyaWkSRNUW97GkleBXwSeJjn/vzdDuwGdgLfC+wHrqqqr3XLvBX4\nVeAZxsNZH+/aLwJuAU4C7q6q65dZn3sa7mlIWqXV7mn0Ojw1TYaGoSFp9dbN8JQkaf4YGpKkZoaG\nJKmZoSFJamZoSJKaGRqSpGaGhiSpmaEhSWpmaEiSmhkakqRmhoYkqZmhIUlqZmhIkpoZGpKkZoaG\nJKmZoSFJamZoSJKaGRqSpGaGhiSpmaEhSWpmaEiSmhkakqRmhoYkqZmhIUlqZmhIkpoZGpKkZoaG\nJKmZoSFJamZoSJKaGRqSpGaGhiSpmaEhSWpmaEiSmhkakqRmJw5dgOZLkqFLoKqGLkGaW73uaST5\nQJJDSfZMtC0mOZDkwe7xmonXtid5NMm+JJdNtF+UZE/32nv6rFnHqwZ+SOpT38NTHwS2LWkr4F1V\ndWH3+BhAkq3A64Ct3TLvy3N/tt4EXFtVW4AtSZa+pyRpCnoNjar6FPDVZV5abgzjCuD2qjpcVfuB\nx4BLkpwJnFJVu7v5bgOu7KNeSdLRDXUg/M1JPpvk5iSndm1nAQcm5jkAnL1M+8GuXZI0ZUOExk3A\nZuAC4HHgnQPUIEk6BlM/e6qqnjwyneT9wF3d04PAuROznsN4D+NgNz3ZfnC5915cXHx2emFhgYWF\nhbUoWZLmxmg0YjQaHfPy6fv0xCTnAXdV1fnd8zOr6vFu+gbg5VX1C92B8A8DFzMefroP+IGqqiQP\nANcDu4GPAu+tqnuWrKc2+qmW4/MGhvx/MPT6xzVs9M+BtBpJqKrmc+V73dNIcjtwKXB6ki8DO4CF\nJBcw/nb5EvBGgKram2QnsBd4BrhuIgWuA24BTgLuXhoYkqTp6H1PY1rc03BP40gNG/1zIK3Gavc0\n/BkRSVIzQ0OS1Ow7hkaSM7rrKe7pnm9Ncm3/pUmS1puWPY1bgHsZX2QH8ChwQ18FSZLWr5bQOL2q\n/hL4P4CqOsz47CZJ0gbTEhpPJ/nuI0+SvAL4en8lSZLWq5brNH6L8VXb35fkn4CXAD/fa1WSpHWp\n6TqNJCcCP8h4z2RfN0S1rnidhtdpHKlho38OpNVY8+s0krwQ2A78ZlXtAc5L8jPHUaMkaUa1HNP4\nIPC/wI93z/8N+IPeKpIkrVstofH9VfV2xsFBVf1XvyVJktarltD4nyQnHXmS5PuB/+mvJEnSetVy\n9tQicA9wTpIPA68EfrnHmiRJ69RRQyPJCcCLgdcCr+ia31JV/953YZKk9ec7nnKb5DNVddGU6jlm\nnnLrKbdHatjonwNpNVZ7ym1LaLwN+Arwl8CzB8Gr6j+Ptcg+rIfQGH9pD83QGPpzIM2SPkJjP8t8\nE1TV5lVX16P1Exob+Ut76PWPaxj6cyDNkjUPjVlhaMDwX9pDr39cw9CfA2mWrPk9wpO8lm//Jvg6\n
sKeqnlxlfZKkGdZyyu2vAj8G3M/4T8lLgX8BNif5/aq6rcf6JEnrSEtoPB94WVUdAkiyCfgQcAnw\nScDQkKQNouWK8HOPBEbnya7tP+h+WkSStDG07Gncn+SjwE7Gw1OvBUbdr99+rc/iJEnrS8sptycA\nP8f450MA/hH4yOCnKi3h2VMw/NlLQ69/XMPQnwNplqz52VNV9c0knwa+XlW7kpwMvAh46jjqlCTN\noJabMP0a8FfAn3ZN5wB39FmUJGl9ajkQ/ibgVcA3AKrqEeB7+ixKkrQ+Nd1Po6qevX9Gd79wB40l\naQNqCY2/T/K7wMlJXs14qOqufsuSJK1HLWdPPQ+4Frisa/o48P7BT1VawrOnYPizl4Ze/7iGoT8H\n0izp5QcLk3wPwHr+rSlDA4b/0h56/eMahv4cSLNktaGx4vBUxhaTfAX4IvDFJF9JsiPr48YRkqQp\nO9oxjRsYX9D38qp6cVW9GLi4a7thGsVJktaXFYenkjwEvHrp/cCTvATYVVUXTKG+Zg5PwfDDQ0Ov\nf1zD0J8DaZas2fAUcOLSwADo2lp+s0qSNGeOFhqHj/E1SdKcOlpo/EiSp5Z7AOe3vHmSDyQ5lGTP\nRNtpSXYleSTJvUlOnXhte5JHk+xLctlE+0VJ9nSvvedYNlSSdPxWDI2qel5VnbLCo3V46oPAtiVt\nNzI+JvJS4BPdc5JsBV4HbO2Wed/EWVo3AddW1RZgS5Kl7ylJmoKWK8KPWVV9CvjqkubLgVu76VuB\nK7vpK4Dbq+pwVe0HHgMuSXImcEpV7e7mu21iGUnSFPUaGivYNHEnwEPApm76LODAxHwHgLOXaT/Y\ntUuSpmyI0HhWd46s50dK0owY4tTZQ0nOqKonuqGnIz9NchA4d2K+cxjvYRzspifbDy73xouLi89O\nLywssLCwsHZVS9IcGI1GjEajY16+6benjkeS84C7qur87vk7gP+oqrcnuRE4tapu7A6Ef5jxVedn\nA/cBP1BVleQB4HpgN/BR4L1Vdc+S9Xhx3+AX1w29/nENQ38OpFmy5rd7Pc5ibgcuBU5P8mXg94C3\nATuTXAvsB64CqKq9SXYCe4FngOsmUuA64BbgJODupYEhSZqO3vc0psU9DRj+L/2h1z+uYejPgTRL\n1vJnRCRJ+haGhiSpmaEhSWpmaEiSmhkakqRmhoYkqZmhIUlqZmhIkpoZGpKkZoaGJKmZoSFJamZo\nSJKaGRqSpGaGhiSpmaEhSWo2xO1epV6N72syHO/noXlmaGgODX0jKml+OTwlSWpmaEiSmhkakqRm\nhoYkqZmhIUlqZmhIkpoZGpKkZoaGJKmZoSFJamZoSJKaGRqSpGaGhiSpmaEhSWpmaEiSmhkakqRm\nhoYkqZmhIUlqZmhIkpoZGpKkZoOFRpL9SR5O8mCS3V3baUl2JXkkyb1JTp2Yf3uSR5PsS3LZUHVL\n0kY25J5GAQtVdWFVXdy13QjsqqqXAp/onpNkK/A6YCuwDXhfEveSJGnKhv7izZLnlwO3dtO3Ald2\n01cAt1fV4araDzwGXIwkaaqG3tO4L8mnk7yha9tUVYe66UPApm76LODAxLIHgLOnU6Yk6YgTB1z3\nK6vq8SQvAXYl2Tf5YlVVkjrK8t/22uLi4rPTCwsLLCwsrFGpkjQfRqMRo9HomJdP1dG+l6cjyQ7g\naeANjI9zPJHkTOD+qvqhJDcCVNXbuvnvAXZU1QMT71FDb0sSlsmyaVawwde/HmoIQ38OpdVIQlUt\nPVSwokGGp5KcnOSUbvqFwGXAHuBO4JputmuAO7rpO4HXJ3lBks3AFmD3dKuWJA01PLUJ+JvxX+ac\nCPx5Vd2b5NPAziTXAvuBqwCqam+SncBe4BngusF3KyRpA1oXw1NrweEpWA9DMw5POTyl2TITw1OS\npNlkaEiSmhkakqRmhoYkqZmhIUlqZmhIkpoZGpKkZoaGJKmZoS
FJamZoSJKaGRqSpGaGhiSpmaEh\nSWo25J37pLnU/eT/YPyVXfXJ0JDW3NA/Ty/1x+EpSVIzQ0OS1MzQkCQ1MzQkSc0MDUlSM0NDktTM\n0JAkNTM0JEnNDA1JUjNDQ5LUzNCQJDUzNCRJzQwNSVIzQ0OS1MzQkCQ1MzQkSc0MDUlSM0NDktTM\n271Kc2boe5SD9ymfZ4aGNHeG/sIePrTUn5kZnkqyLcm+JI8m+Z2h65GkjWgmQiPJ84A/AbYBW4Gr\nk7xs2KqmbTR0AT0bDV1Az0ZDF9Cj0dAF9Go0Gg1dwroyE6EBXAw8VlX7q+ow8BfAFQPXNGWjoQvo\n2WjoAno2GrqAHo2GLqBXhsa3mpXQOBv48sTzA12bJGmKZuVAeNORvfVw1oik4f8tevZWfzIL/3OT\nvAJYrKpt3fPtwDer6u0T86z/DZGkdaiqmlN+VkLjROCLwE8D/wbsBq6uqi8MWpgkbTAzMTxVVc8k\n+Q3g48DzgJsNDEmavpnY05AkrQ+zcvbUUSXZn+ThJA8m2T10PccryQeSHEqyZ6LttCS7kjyS5N4k\npw5Z47FaYdsWkxzo+u/BJNuGrPF4JDk3yf1JPp/kc0mu79rnpf9W2r656MMk35XkgSQPJdmb5I+6\n9pnvv6Ns26r6bi72NJJ8Cbioqv5z6FrWQpKfAJ4Gbquq87u2dwBfqap3dFfEv7iqbhyyzmOxwrbt\nAJ6qqncNWtwaSHIGcEZVPZTkRcBngCuBX2E++m+l7buK+enDk6vqv7tjqf8A/DZwOfPRf8tt20+z\nir6biz2Nztycb1tVnwK+uqT5cuDWbvpWxv9QZ84K2wZz0n9V9URVPdRNPw18gfE1RfPSfyttH8xP\nH/53N/kCxsdQv8r89N9y2war6Lt5CY0C7kvy6SRvGLqYnmyqqkPd9CFg05DF9ODNST6b5OZZ3PVf\nTpLzgAuBB5jD/pvYvn/umuaiD5OckOQhxv10f1V9njnpvxW2DVbRd/MSGq+sqguB1wBv6oZA5laN\nxxRnf1zxOTcBm4ELgMeBdw5bzvHrhm4+Arylqp6afG0e+q/bvr9mvH1PM0d9WFXfrKoLgHOAn0zy\nU0ten9n+W2bbFlhl381FaFTV491//x34G8a/VTVvDnXjySQ5E3hy4HrWTFU9WR3g/cx4/yV5PuPA\n+FBV3dE1z03/TWzfnx3ZvnnrQ4Cq+jrwUeAi5qj/4Fu27UdX23czHxpJTk5ySjf9QuAyYM/Rl5pJ\ndwLXdNPXAHccZd6Z0v0jPOJnmeH+y/j3M24G9lbVuydemov+W2n75qUPk5x+ZHgmyUnAq4EHmYP+\nW2nbjoRh5zv23cyfPZVkM+O9CxhfrPjnVfVHA5Z03JLcDlwKnM547PH3gL8FdgLfC+wHrqqqrw1V\n47FaZtt2AAuMd40L+BLwxonx45mS5FXAJ4GHeW4IYzvjXzGYh/5bbvveClzNHPRhkvMZH+g+oXt8\nqKr+OMlpzHj/HWXbbmMVfTfzoSFJmp6ZH56SJE2PoSFJamZoSJKaGRqSpGaGhiSpmaEhSWpmaEiS\nmhkakqRm/w9nuYIgCCnvaAAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "a=P10K_birthdate.count(axis=1)\n", "a.sort(ascending=False) \n", "a.plot(kind='hist');\n", "a.mean(), a.median(), a.std()" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"260.914634146 2.0 1071.05119648\n" ] }, { "data": { "text/plain": [ "birthDate 6334\n", "22-rdf-syntax-ns#type 6334\n", "owl#sameAs 6331\n", "wikiPageID 6331\n", "wikiPageRevisionID 6331\n", "rdf-schema#label 6331\n", "rdf-schema#comment 6319\n", "birthYear 6287\n", "description 5891\n", "birthPlace 3446\n", "deathDate 2868\n", "deathYear 2858\n", "thumbnail 2006\n", "viafId 1670\n", "deathPlace 1083\n", "height.1 947\n", "height 864\n", "team 828\n", "position 804\n", "occupation 760\n", "dtype: int64" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a=P10K_birthdate.count(axis=0)\n", "a.sort(ascending=False) \n", "print a.mean(), a.median(), a.std()\n", "a.head(20)" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZQAAAEDCAYAAAASpvJbAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmYFNW9//H312EVAXeDXHCMgIFAUFZFDaMYHRdERAXi\nFtBo9KpoTPQSo07i9lNvjBo1Jk8iURNBEhUhSlDUiduNyiK7AaIoGhRwwwVZz++PUxPaoXumZ6a7\nT1fX5/U8/UxXdfXpb3XX9LfPUnXMOYeIiEhT7RA6ABERKQ1KKCIikhNKKCIikhNKKCIikhNKKCIi\nkhNKKCIikhNKKCIikhNKKCIikhMlm1DMrI2ZvWpmx4WORUQkCUo2oQCXAw+FDkJEJClik1DM7F4z\ne9/MFtRaX2lmr5vZMjO7Ilr3HWAxsCZErCIiSWRxuZaXmR0GfAbc75zrFa0rA/4JHAm8C7wKjAZO\nA9oAPYD1wHAXlx0VEYmpZqEDyJZz7nkzK6+1egCw3Dm3AsDMJgHDnHM/jZbPAtYomYiI5F9sEkoG\nHYGVKcvvAANrFpxz9xU8IhGRhIp7QmlSzcPMVHMREWkE55zVXhebTvkM3gU6pSx3wtdSsuacK9rb\nNddcU9RlN6aMbJ+TzXb1bVPX45key+d7XiyfWz7Lz+cxkYvjQsdEbsrPJO4JZRbQ1czKzawFMBKY\n2pACqqqqqK6uzkdsTVZRUVHUZTemjGyfk8129W1T1+P5fG/zKd9xN7X8fB4T2W7b2M9dx0T95VdX\nV1NVVZVx2ziN8poIDAZ2A1YDVzvnJpjZMcBtQBnwe+fcjQ0o08Vl/6Uwqqqq6vyHkeTRMbE9M8Ol\nafKKTR+Kc250hvXTgekFDkdKVFx/pUr+6JjIXtybvJqsmJu8pPD05SG16ZjYpmSavPJBTV4iIg2X\nqckr8TUUERHJjcQnFDV5iYhkR01edVCTl4hIw8V+lFe+HHBA9tvadm9fbrZV2em3LSuDZs22/1tz\nv00b6N0b+veHAw/0yyISTuJrKHPnZrf/DXmbGvqWquztbd3qb5s3+9uWLV/9u3kzfPIJzJ0Ls2bB\nwoXw9a/75NKvn78deCC0aNGw1xWR+qmGksGUKVVUVFRoaGDMbdwICxb45DJ
rFtx9t08w06aFjkyk\ndFRXV9fZ55z4GkqS97+UbdgA3/gG3H8/HHZY6GhESouGDUuitGwJVVXwk580vLlNRBpHCUVK1umn\nwwcfwIwZoSMRSYbEJxSdh1K6ysrg2mt9LWXr1tDRiMSfzkOpg/pQSp9zfuTXFVfAKaeEjkakNGTq\nQ1FCSfD+J8WMGTBunB9a3Czx4xpFmk6d8pJYRx0Fe+0FDzwQOhKR0qYaSoL3P0lefBG++11YutSP\nABORxlMNRRLtkEOgVy/47W9DRyJSuhKfUDTKKzmuuw5uuAE+/zx0JCLxpFFedVCTV/KMGuUvKDl+\nfOhIROJLo7zSUEJJnqVLffPX0qWwyy6hoxGJJ/WhiADdusHIkTB0KLz9duhoREqLEookzh13wAkn\n+BMeH300dDQipUNNXgne/6R7+WUYPRoqK+EXv4DWrUNHJBIPavISqWXgQD9B14cf+vuLF4eOSCTe\nEn8hiqoqTbCVZO3bw8SJMGECDB7saystW351quE99oD/+R9dtkVEE2zVQU1ekmrZMn9Gfc0UwzW3\niRPhjDPgwgtDRyhSHDRsOA0lFMnGwoVwxBG+SWz33UNHIxKeEkoaSiiSrXHj/LTC99wTOhKR8JRQ\n0lBCkWx99BF07w7Tp8OBB4aORiQsjfISaYJddvGzP150keaoF8lECUUkS2PHwvr1vpNeRLanJq8E\n77803Esvwamnwuuvw047hY5GJAw1eYnkwKBBfsTX9deHjkSk+KiGkuD9l8ZZtQq+9S0/pXBlZeho\nRApPNZQMNMGWNFSHDjB1Kpx5JkyZEjoakcLRBFt1UA1FmmL2bDjuOLjtNj9xl0hSZKqh6OpEIo3U\nty/MnAlHH+1Hf40ZEzoikbCUUESaoGdPePZZOPJIf37K2LGhIxIJR01eCd5/yZ2FC2HIEFi5Elq0\nCB2NSH6pU14kj3r29LdHHgkdiUg4SigiOXL++bp4pCSbmrwSvP+SW5s2wT77+I76Hj1CRyOSP2ry\nEsmz5s3h7LPhN78JHYlIGKqhJHj/Jffefttf3v7tt6FNm9DRiOSHaigiBdC5MxxyCDz0UOhIRAqv\nJBOKmX3DzH5tZpPN7OzQ8Uiy/OAH8Otfh45CpPBKusnLzHYAJjnnTs3wuJq8JOe2bIEuXeDPf4Z+\n/UJHI5J7sW/yMrN7zex9M1tQa32lmb1uZsvM7IqU9UOBx4FJhY5Vkq2sDM49V0OIJXliU0Mxs8OA\nz4D7nXO9onVlwD+BI4F3gVeB0c65JSnPe8w5NyxDmaqhSF6sXu2HDt91F4wcGToakdyK/cUhnXPP\nm1l5rdUDgOXOuRUAZjYJGGZmewInAa2AZwsYpggAe+7pz0c58USYP9/PR79DbNoDRBonNgklg47A\nypTld4CBzrm/A3/PpoDUa/tXVFRQUVGRw/AkyQ44AF59FU4+2SeWP/4R2rULHZVIw1VXV2c1b1Rs\nmrwAohrKtJQmrxFApXPu+9Hy6fiEclGW5anJS/Ju0yYYNw6eecaf9Dh4cOiIRJom9p3yGbwLdEpZ\n7oSvpYgUjebN4e67/Tz0Z5wB3/0u/PvfoaMSyb24J5RZQFczKzezFsBIYGpDCtAUwFIoI0bAkiWw\n775+TvqrroI5c/w8KiJxUDJTAJvZRGAwsBuwGrjaOTfBzI4BbgPKgN87525sQJlq8pIgli3zJz8+\n/jh8+ikceyx8+9tw0EHQtSvYdo0JIsUjU5NXbBJKPiihSDFYtgymT4eXXoJ//APWrfMJZtw46N8/\ndHQi2yvVPpQmU5OXhNa1K1x8MUyaBCtWwOLF/gKTp5zirws2c2boCEW8kmnyygfVUKSYbd4MU6bA\nRRdBVRWcd17oiEQ8NXmloYQicbB8OVRWwujR8POfq39FwlOTVwZq8pJi16WL71+ZMQPGjPHntYiE\noCavOqiGInHy+ecwahRs3Ah/+Qu0bRs
6Ikkq1VBEYq5NG3j0USgv92fbv/de6IhEvkoJRSRGmjXz\nl8UfOhSOPBLWrg0dkcg2cb84ZJNVVVXpopASK2Z+1NeGDb6z/umnoX370FFJEtR3kUj1oSR4/yXe\nnIMLL/SXbxkxAlq39mfb9+oVOjIpdRo2nIYSisTd1q1w553w1luwapW/VtjcuaGjklKnhJKGEoqU\nks2bYa+9/IReHTuGjkZKmUZ5ZaDzUKRUNGsGRx/tLzgpkg86D6UOqqFIqfnTn2DyZHjssdCRSClT\nk1caSihSaj74wM+3sno1tGoVOhopVWryEkmA3Xbzk3epFVdCUEIRKTHHHad+FAlDCUWkxBx/vE8o\nas2VQkt8QtEoLyk1PXv6s+mffTZ0JFJqNMqrDuqUl1I1dSr8+Mcwb5465yX31CkvkiAnnOAvwXLD\nDaEjkSRRDSXB+y+l7d//ht694ZlndH0vyS3VUEQSZu+94fbbYcgQeOSR0NFIEqiGkuD9l2R45RU/\n0+Oxx8Ktt0KLFqEjkrhTDUUkoQYM8FcgfvttOOooWLMmdERSqhKfUDRsWJKgfXs/ffCgQdCvH0yf\nHjoiiSMNG66DmrwkiZ58Ei64wJ+vcsQRsN9+/lZe7pvDzPxNJBNdHDINJRRJqvXrYcIEWLQI/vUv\nf3vrLT+nStu2cPPNcO65SiySnhJKGkooIttbtAjGjoWddoLf/c5fvVgklTrlRSQr3/wmvPiin6yr\nf38/x4pINlRDSfD+i9Rn0SI/Muzmm+G000JHI8VCTV5pKKGI1G/xYn9y5FlnweWXw667ho5IQlOT\nl4g0So8e8Oqr8OGH/v4//hE6IilWqqEkeP9FGuqJJ3xN5bLLoHNnP9y4Vy/YccfQkUkhZaqhNAsR\nTDGpqqqioqKCioqK0KGIFL1jj4W//hUeeMCffb98Oaxb5y/vsssuoaOTfKuurq7zRHDVUBK8/yK5\ncMklsHSpPxO/ZcvQ0UghqA9FRPLillt8k1fHjn4ellGj4IUXQkclIaiGkuD9F8mlt96C2bNh9Wq4\n/nq46ip/tr2UHg0bTkMJRSQ//vY3P1vkc8+FjkTyQQklDSUUkfxYvx722svXWtRZX3rUhyIiBdO6\nNRx2GDz1VOhIpJCUUEQkL449VvOuJI0SiojkxdCh/pyVJ54IHYkUihKKiORF584wbZq/FP5ll8FH\nH4WOSPJNCUVE8uagg+C11+DTT+GAA2DKFD+sWEqTRnkleP9FCunxx+Gmm2DBAjj7bD+suEWL0FFJ\nYyRq2LCZDQOOA9oBv3fOpR1rooQiUnhr1sCpp/p5VsaPDx2NNEaiEkoNM9sZ+F/n3DkZHldCEQng\njTdgwACYN89fskXiJfbnoZjZvWb2vpktqLW+0sxeN7NlZnZFraf9FLizcFGKSDa+/nU4/3y4+GLQ\nb7rSEZuEAkwAKlNXmFkZPmFUAj2A0WbW3bybgOnOudcKH6qI1OfKK/1skLffDhs3ho5GciE2CcU5\n9zxQe+DhAGC5c26Fc24TMAkYBlwIDAFONrPzChupiGSjVSuYPNkPLS4vh6oqWLUqdFTSFHGfYKsj\nsDJl+R1goHPuIuBX2RRQVVX1n/uaaEuksHr1gqefhkWL4K67/PKMGdC3b+jIJFV9E2vViFWnvJmV\nA9Occ72i5RFApXPu+9Hy6WxLKNmUp055kSLy8MO+X+Xhh/05LFKcSnUK4HeBTinLnfC1FBGJoREj\noFkzP1HXPvv4M+xPPRV2iE3jfLLF/WOaBXQ1s3IzawGMBKY2pICqqqqsqnIiUhjDhsE778C118Iv\nfgEDB8LataGjEvBNX6ndBLXFpsnLzCYCg4HdgNXA1c65CWZ2DHAbUIY/ifHGBpSpJi+RIrZ1K5xz\nDnTo4GeBlOKQyBMb66OEIlL83nwT+vWDZctg111DRyNQAic25ouavESK2777wqhR8JOfhI5ESqbJ\nKx9
UQxGJh48/9kOKzzkHrr4abLvfxlJIqqGISGztvDO8/DI8+CDMnBk6Gskk8QlFTV4i8bD33nDF\nFXDjjf5ESCk8NXnVQU1eIvGyYYPvT/m//4Mzz/QXmOzQwV/GRQpHTV4iEnstW8Kjj/rL3q9ZA/37\nw5gxoaOSGqqhJHj/ReJu+XIYPNifCKmO+sJRDSUD9aGIxNd++8GmTbByZf3bStOpD6UOqqGIxN/w\n4TBypO9bkcJQDUVEStKgQfDSS6GjEFBCEZGYU0IpHolPKOpDEYm3vn1hyRL4/PPQkZQ+9aHUQX0o\nIqXh4IPhuutgyJDQkSSD+lBEpGSdcw5ccgl8+GHoSJJNCUVEYm/sWDj+eOjSBU48EdTwEIYSiojE\nnpm/xtf8+bBiBUye7CfnksJKfEJRp7xI6fiv//JTB19wgb/Ol+SWOuXroE55kdK0bh307Omv/dWn\nD0yapEuz5JKmAE5DCUWkdH3yCbz3Hpx2GvTo4S/T8qMfQZs2oSOLPyWUNJRQRErf8uX+CsUvvwzv\nvw+PPw7t2oWOKt6UUNJQQhFJjq1b4eyzoXlz+O1vQ0cTb0ooaSihiCTL2rXQrZuf8bFDh9DRxJdO\nbBSRxNt9dzjmGJg6NXQkpSnxCUXDhkWS5fjj4U9/8pNyScNo2HAd1OQlkjzr1vlLtbzyClx4IXz7\n2zBgQOio4kV9KGkooYgk1/33w+zZ/hyVWbOgU6fQEcWHEkoaSigicv31MHOmv5WVhY4mHpRQ0lBC\nEZEtW+CII/xw4gce0OivbGiUl4hIGmVl8Nhj/gz6iRNDRxNvSigikng77wznngu33gpPPx06mvhS\nk1eC919EtnEOxo2DVavgz38OHU1xUx9KGkooIpJq1SrYe2/fQa/phDNTH4qISD06dIDbboNRo/z5\nKtIwiU8oOlNeRFKNGweVlfDLX2oq4dp0pnwd1OQlIuksXQq9e8Oll8INN4SOpvioDyUNJRQRyeSt\nt/xsjytWQNu2oaMpLkooaSihiEhdTj4Z9toLxoyBfv1CR1M81CkvItJA117rO+ePPhrWrw8dTfFT\nQhERyaB7d385lkMPhX32gfvuCx1RcVOTV4L3X0Sys2GDn4v++uv9FYqTTn0oaSihiEi2tmyBjh3h\nhRegS5fQ0YSlPhQRkSYoK/Od9JMnh46keCmhiIhkafhwf2ViSU9NXgnefxFpmI0b/cyONeelHH00\n3HVX2JhCSFQfipntC1wJtHfOnVLHdkooItIgH37ob+vWweGHw9q1fnKuJElUH4pz7k3n3Dmh4xCR\n0rPrrr5Tvk8f6NbNX5147tzQURWH2CQUM7vXzN43swW11lea2etmtszMrggVn4gkz3PP+Y76J58M\nHUlxiE1CASYAlakrzKwMuDNa3wMYbWbdA8QmIgnUujV85ztw880wfnzoaMKLTUJxzj0PfFRr9QBg\nuXNuhXNuEzAJGGZmu5rZPcABqrWISD4NGwaTJsH994eOJLxmoQNooo7AypTld4CBzrkPgR+ECUlE\nkqSsDI480p9NP368n5/+ssugWdy/XRsh7rvc5CFaqZPFVFRUUFFR0dQiRSRhzODuu2HxYrjjDj9B\nV+/eoaPKnerq6qwmIozVsGEzKwemOed6RcsHAVXOucpoeTyw1Tl3U5bladiwiOTUyJFwwglw2mmh\nI8mfUh02PAvoamblZtYCGAlMDRyTiCRYz56wcGHoKMKITUIxs4nAS0A3M1tpZmOcc5uBC4EZwGLg\nIefckoaUqznlRSSXevaERYtCR5EfmlO+DmryEpFcW7rU96G88UboSPKnVJu8mkw1FBHJpf32g1Wr\n4PPPQ0eSe6qh1EE1FBHJhx494KGHoFev0JHkh2ooIiIF0qULLF8eOorCU0IREcmx7t3hvPNg//1h\n3rzQ0RRO4hOK+lBEJNd+9jN48UXf9PXaa6GjyR31odRBfSgikk9XX
gmtWsFVV4WOJLfUhyIiUmCd\nO8OSJfDll6EjKYzEJxQ1eYlIvvTpA089BZdfHjqS3FCTVx3U5CUi+fboo3DffTBlSuhIckdNXiIi\nAXzta/5ExyRQQhERyaMOHZKTUNTkleD9F5H827AB2raFo47yy507+7lT4ixTk1fcJ9hqsqqqKk2s\nJSJ507IlPPMMfPwxbN0KI0bAXXf5Sbnipr6JtlRDSfD+i0jhtWsHK1dC+/ahI2k8dcqLiBSB3XaD\ntWtDR5EfSigiIgW0++7wwQeho8iPxPehiIgU0m67wd/+BqtX++WBA2GPPcLGlCuJr6HoTHkRKaST\nToJXXoF77oHLLovXiC+dKV8HdcqLSEi33grvvOP/xok65UVEiky7drBuXegockcJRUQkECUUERHJ\nCSUUERHJiVJLKBo2LCISSLt28O678PDDfrlZMzjuOP83jmIadu7oWl4iEso++8CgQfDgg375mWdg\n5kzo2zdsXJnoWl510LBhESkmhx0GN9zg/xYzDRsWESlyrVvDF1+EjqLxlFBERIpE69awfn3oKBpP\nCUVEpEjsuKMSioiI5ICavEREJCfU5CUiIjmhJi8REcmJuNdQEn9io4hIsWjdGp54AjKdHtepE4wd\nW9iYGiLxNRRNsCUixWL4cDjiCNi6dfvbZ5/B5ZeHjU8TbNVBZ8qLSFx8/LG/VMsnn4SORGfKi4jE\nWvPmsGlT6CjqpoQiIhIDzZopoYiISA40bw6bN2fusC8GSigiIjGwww7+tmVL6EgyU0IREYmJYu9H\nUUIREYkJJRQREcmJYu+YV0IREYmJmo75YqWEIiISE8Xe5FWS1/IyszbA3cAGoNo592DgkEREmqzY\nE0qp1lBOAiY7584FTggdjIhILqgPJUfM7F4ze9/MFtRaX2lmr5vZMjO7IlrdEVgZ3S/iUdtSbHSh\nUKmtmI4J9aHkzgSgMnWFmZUBd0brewCjzaw78A7QKdosTvsogRXTl4cUh2I6JtTklSPOueeBj2qt\nHgAsd86tcM5tAiYBw4BHgBFmdjcwtbCR5k4+D+RclN2YMrJ9Tjbb1bdNXY8X05dEQ+Q77qaWn89j\nItttG/u5x+GYSJdQiumYiE1CySC1aQt8zaSjc+4L59xY59wFzrmJgWJrMiWUpm0T9y+PdIrpyyNX\nz1dCyV7LlnDJJTB06LbbuedWM3QoPPdcfl6zIe9LrOZDMbNyYJpzrle0PAKodM59P1o+HRjonLso\ny/Lis/MiIkUk3XwocR82/C7b+kqI7r+T7ZPTvSEiItI4cW/ymgV0NbNyM2sBjCTGfSYiInEWm4Ri\nZhOBl4BuZrbSzMY45zYDFwIzgMXAQ865JSHjFBFJqjoTipl1MrNnzWyRmS00s4uj9beY2RIzm2dm\nj5hZ+4a+cFSrWJCyPDEqb1y67Z1zo51zezvnWjrnOjnnJkTrpzvn9nfOdXHO3Zjhtdqb2fkNjTHX\nzOw8M/uemfU2s3tS1g+LhjtLlsysyswua8TzBpvZwSnLf4j64mpvV25m681sjpktNrOXzeysLMrv\nbWbHNDQuaZx8HwfRY93M7AkzW2pms83sITPbsylxl6r6aiibgEudc98EDgL+O/riexL4pnOuN7AU\nGF/fC0XnjGR67GtAP+dcb+fc7VlHn71dgAvyUG5DHQo8B1REf2sMx59HI9lr7ICKw4FBWZaz3DnX\nxznXAxgFXGJm36un/AOBYxsZmzRcXo8DM2sF/BW4yznXzTnXF39Zpz0a+bqlzTmX9Q2YAgyptW44\n8McM21cDvwReBS4F+gLzgNeAW4AF0XbzgS+AucChtco4BVgQPac6Wvd3oHfKNi8A3wKqgHuBZ4F/\nARdFj09KKf8moA0wE5gdvfYJ0Xb9o/haRtssBHqk2a8/ALcDL0avMyJa3wGfKOZGMR8Srb80WvdZ\n9PcjYBH+wDwY+AB4A5gDfB2Yn
fJaXWuWgRVR/POBl4H9ovV7AH8BXolugxryucblBlwJ/BN4HngQ\nuAzYD5iO7097Dtg/2nYo8I/oPX0K2BMoB1bhB27MwSf4CRk+y/Ka4zPl9Q8H5kT3B+CbYOdEz+0G\ntADeBlZHn/Mp0XF0b/R5zak51nSLzXEwFvhDhjjKo9eaHd0OjtZX4Eej1mx3J3BWdP//Rf/784Bb\nonUl8//bkA+xHHgL2KnW+mnAdzM851ngzpTl+UQJA7iZbQlln9r/vLWe0yG63y76eybwy+h+N+DV\n6H4VPrk0B3YD1gJltcuP1rWN7u8OLEt57Fp8srsTuCJDTBPw/TUA3WueHx3YP4nuW+p7Fb3OY9H9\nl9OUd1LK8jNECRO4Afjv6P6bwPjo/hk1By3+n6omeXUGFoc+sHJ+oPofI/OBVkBbYFn0fs8EukTb\nDASeju7vnPLcc4D/je5fA/ww5bE/ZPgsy2sfk8DOwBfR/bZAWXT/SOAv0f2zgDtSnnMDcFrK8/8J\n7Bj6/YzrLcBxcCvRD9M0sbQGWkb3u7Lte6iCryaUX+G/s3YFXk9ZX/N9VjL/v1kNGzaznfAZdJxz\n7rOU9VcCG13dV/N9KNp2Z6C9c+6FaP0DQE1bc13Dd18E7jOzyfgz4IliucrMfoz/BTEhWu+Ax50/\na/4DM1sN7JWm/B2AG83sMGArsLeZ7emcWw38HP8rZz1Q1/ksUwCcc0vMbK9o3SvAvWbWHJjinJuX\nsn1fYL6ZtQM+TlNeaoy/A8aY2Q+BU/E1pxo1J2pOwtf+wH+hdTf7TxFtzWxH59wXdcQfN4cBjzjn\nvgS+NLOp+C+VQcCfU/a9RfS3U3TMfC1a90ZKWanvtSP9Z5lO6vN2Bu43sy5RGc1Stknd7ihgqJn9\nKFpuiR/e/s+6d1cyKPRx4Mj8/dQCuNPMeuOvGdi1ntg/iWL+Pb4Z7a/R+pL5/603oURfjg/jm7Wm\npKz/Hr6teEjKunvxbcjvOueOj1Z/nqnoDK93fVSuc779+nwzGwAcB8w2sz7OuY/M7CngRHyzQp+U\nIjam3N+SYR9Pw9cY+jjntpjZm/iDkmh9G3wtpjXwhZldF72+c87VvFbq6xj+weejJHU88AczuxX4\nG76qvQfwJb4tvq2ZzcXXSt6Mykhtw30Y/wvqGXxzV+1LztSoeY7hT+jcmGG7UpDuH3sH4GPn3IFp\ntv8V/tfoX81sML72msl2n2UGB+JHE4KvyT7tnBtuZvvgm3czOck5t6yOxyV7hT4OFgGDM2x/KbDK\nOXdG1Ef8ZbR+M1/tn26FP4l8S/RdNgQ4GT9CdQgl9P9b3ygvA36Pr4LdlrK+EvgxMCz6pQCA85c7\nOTAlmcC2L9uPgY/N7JBo/WnpXtM5d2VURp/otfZzzr3inLsGWMO2Exl/B9wBvOKc+6Se/fwUXz2u\n0Q5YHX3Ah+ObxGr8Bvgpvhp6UxTTT1NjysTMOgNrnHO/i+Lr45xb45w7AN9W2x/4I/C9qLyaZPJp\nFFPNe7ABPxT61/j291QjU/6+FN1/Erg4JY4D6oozpp4DTjSzVmbWFt82/gXwppmdDP54NbNvRdu3\nA/4d3f9eSjm1j4WsRFdpuAX/BVW7/DEpm66rVf4MvvrZpPvSk+wV+jiYCAwys/8MtDCzb5vZN6Oy\n34tWn4n/EQq+a6CHmbWIWmaGAM78PE07O+emAz8Eekfbl8z/b32jvA4BTgcON7O50e0Y/D/VTsBT\n0bq76ygj9Zf3GOCu6Nd57ccyjda42czmmx9i/KJzbj6Ac24Ovgo5odb225XjnPsAeNHMFpjZTcCf\ngH5mNh/fF7EEfxyeCWxwzk3Cd571N7OKLPar5v7hwGtmNgffVHUb/GeE267OuQ/xVfMX+KpJwI+j\nIYn7RusexDfHPVlr213MbB6+Oe7SaN3F0f7MM7NFwLkZYo4t59xcfPPpPOAJfPOiw/8wOdvMXsM
P\noqiZ/6YK3wQyC/9DpOYzmgYMj4YDH1pTfOpLpdzfL9pucfTatzvn7oseuxnfbDoH/0VS87xn8V8m\nc83sFHxNpnl0DC8EftbU9yLJCn0cOOfW41scLoqGDS8CfoAfeHE3cFb0mvvjB93gnFsJTI7ieAj/\nYxJ8AptbopQ5AAAAeUlEQVQW/f8+Twn+/8bqWl6pzGxv4Fnn3P6hY8mHqM29bVQzq1n3JtA3Skwi\nIkUlltfyimoS17Etw5cUM3sU2Bc4otZD8cz+IpIIsa2hiIhIcYnNtbxERKS4KaGIiEhOKKGIiEhO\nKKGIiEhOKKGIiEhOKKGIiEhO/H/UYIZSKT69yAAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Properties are more or less Zipf distributed\n", "a.plot(loglog=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 4: Test if it scales\n", "\n", "### Let's do this for a 100 times larger file" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Let's do this for a 100 times larger file\n", "\n", "! head -1000000 Person.csv > Person_1M.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Smart loading\n", "\n", "\n", "```\n", "# This gives an out-of-memory error\n", "\n", "Persons10K= pd.read_csv('Person_1M.csv', sep=',', header=0, \n", " skiprows=[1,2,3], index_col=0, na_values='NULL', low_memory=False)\n", "\n", "P10K=Persons10K[withoutlabelcolumns].dropna(axis='index', subset=['birthDate'])\n", "len(P10K)\n", "```\n", "\n", "* We load smartly in chunks of 10,000 lines.\n", "* For each chunk we only keep the columns and rows we want.\n", "* This greatly reduces the space we need." 
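, "\n", "\n",
"The chunked loading described above can be sketched on a tiny in-memory CSV. This is only an illustration under stated assumptions: the toy rows and the names `csv_text`, `wanted` and `small` are hypothetical and not part of the DBpedia dump.\n", "\n",
"```python\n",
"import io\n",
"import pandas as pd\n",
"\n",
"# Toy stand-in for Person.csv (hypothetical rows, not real DBpedia data)\n",
"csv_text = u'''URI,birthDate,birthDate_label,spouse\n",
"a,1900-01-01,x,NULL\n",
"b,NULL,y,c\n",
"c,1950-05-05,z,b\n",
"'''\n",
"\n",
"# Read only the header row to find the columns without '_label'\n",
"header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns\n",
"wanted = [c for c in header if '_label' not in c]\n",
"\n",
"# Stream the data two rows at a time and filter each chunk before keeping it\n",
"reader = pd.read_csv(io.StringIO(csv_text), index_col=0, na_values='NULL',\n",
"                     usecols=wanted, chunksize=2)\n",
"small = pd.concat([chunk.dropna(subset=['birthDate']) for chunk in reader])\n",
"print(small)\n",
"```\n", "\n",
"At any moment only one raw chunk plus the already filtered rows are in memory, so peak memory use stays well below that of loading the whole file at once."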
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Compute the positions of the columns without \"_label\" in their name, i.e. the ones we want to keep.\n", "withoutlabelcolumns = [x for x in range(len(Persons10K.columns))\n", "                       if \"_label\" not in Persons10K.columns[x]]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "328" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(withoutlabelcolumns )" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 3min 34s, sys: 52.2 s, total: 4min 26s\n", "Wall time: 12min 42s\n" ] } ], "source": [ "# We do it for the full dataset\n", "# We could improve on this by not unzipping first: the download is gzip-compressed, so it could be streamed directly (e.g. with the gzip module or read_csv's compression='gzip').\n", "# On the full (1.6M lines) dataset, my Mac started to swap heavily.
RAM memory seems the problem.\n", "reader = pd.read_csv('Person.csv', sep=',', header=0, \n", " skiprows=[1,2,3], index_col=0, \n", " na_values='NULL', low_memory=False,\n", " usecols= withoutlabelcolumns, # the_columns_i_want_to_use, \n", " chunksize=10000)\n", "\n", "% time df = pd.concat([ chunk.dropna(axis='index', subset=['birthDate']) for chunk in reader ])" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "969996" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(df)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(14.219113274693916, 14.0, 4.300233671866137)" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZoAAAEACAYAAACK+7BGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGeRJREFUeJzt3X+MXeWd3/H3BxwTJ0EYAzW/DFgrp8XZRFA3OLuJkksR\nxlmt+NFFYNoFV7XSdJ1fjbpVYVeKZ4K6CaoSwm4VWjUOGGvjxS1acBrW2AFuN61kBlKTOEwcGzVe\n4QGbrMEGGm1qL5/+cZ/BZ4fx+Nr4mTue+3lJV/Pc7znPOc85mPnMOee5M7JNRERELaf0egARETG9\nJWgiIqKqBE1ERFSVoImIiKoSNBERUVWCJiIiqqoWNJLeLekpSc9KGpb0lVIfkLRb0tby+mSjzx2S\ndkraLmlJo75I0ray7J5G/TRJD5b6FkkXN5Ytl7SjvG6rdZwRETEx1fwcjaT32P6lpBnA/wR+H7gK\neN3218esuxD4DvBh4ALg+8AC25Y0BHzW9pCkR4E/tr1R0krg122vlHQzcIPtZZLmAE8Di8rmfwgs\nsr2/2sFGRMS4qt46s/3L0pwJnAq8Wt5rnNWvA9bZPmh7F/A8sFjSecDptofKeg8A15f2tcCa0n6I\nTogBXANssr2/hMtmYOmJOaqIiDgWVYNG0imSngX2Ak/afq4s+pykH0laLWl2qZ0P7G50303nymZs\nfaTUKV9fALB9CDgg6awJthUREZOs9hXNm7YvAy4EPi6pBdwLzAcuA14CvlZzDBER0VszJmMntg9I\n+h7wj2y3R+uSvgV8t7wdAeY1ul1I50pkpLTH1kf7XAS8WJ4DnWF7n6QRoNXoMw94Yuy4JOUXvUVE\nHAfb4z0CGVfNWWdnj94WkzQLuBrYKuncxmo3ANtKewOwTNJMSfOBBcCQ7T3Aa5IWSxJwK/BIo8/y\n0r4ReLy0NwFLJM2WdGbZ92PjjdN2XjarVq3q+RimyivnIuci52Li17GqeUVzHrBG0il0Am2t7ccl\nPSDpMsDAz4FPA9gelrQeGAYOASt9+IhWAvcDs4BHbW8s9dXAWkk7gX3AsrKtVyTdSWfmGcCgM+Ms\nIqInqgWN7W3APxynfsTP
tNj+I+CPxqn/EPjgOPVfATcdYVv3Afcdw5AjIqKC/GaAAKDVavV6CFNG\nzsVhOReH5Vwcv6of2JzqJLmfjz8i4nhIwscwGWBSZp3Fya0zB6M38oNAxMkvQRNd6sU3/N4FXESc\nOHlGExERVSVoIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVoIiKi\nqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVoIiKiqgRNRERUVS1oJL1b\n0lOSnpU0LOkrpT5H0mZJOyRtkjS70ecOSTslbZe0pFFfJGlbWXZPo36apAdLfYukixvLlpd97JB0\nW63jjIiIiVULGtt/A1xp+zLgQ8CVkj4G3A5stv1+4PHyHkkLgZuBhcBS4JuSVDZ3L7DC9gJggaSl\npb4C2FfqdwN3lW3NAb4EXFFeq5qBFhERk6fqrTPbvyzNmcCpwKvAtcCaUl8DXF/a1wHrbB+0vQt4\nHlgs6TzgdNtDZb0HGn2a23oIuKq0rwE22d5vez+wmU54RUTEJKsaNJJOkfQssBd40vZzwFzbe8sq\ne4G5pX0+sLvRfTdwwTj1kVKnfH0BwPYh4ICksybYVkRETLIZNTdu+03gMklnAI9JunLMcktyzTEc\nzcDAwFvtVqtFq9Xq2VgiIqaidrtNu90+7v5Vg2aU7QOSvgcsAvZKOtf2nnJb7OWy2ggwr9HtQjpX\nIiOlPbY+2uci4EVJM4AzbO+TNAK0Gn3mAU+MN7Zm0ERExNuN/SF8cHDwmPrXnHV29ugDeEmzgKuB\nrcAGYHlZbTnwcGlvAJZJmilpPrAAGLK9B3hN0uIyOeBW4JFGn9Ft3UhncgHAJmCJpNmSziz7fqzS\noUZExARqXtGcB6yRdAqdQFtr+3FJW4H1klYAu4CbAGwPS1oPDAOHgJW2R2+rrQTuB2YBj9reWOqr\ngbWSdgL7gGVlW69IuhN4uqw3WCYFRETEJNPh7+X9R5L7+fi71bmQ7MV5EvnvEzH1SMK2jr5mR34z\nQEREVJWgiYiIqhI0ERFRVYImIiKqStBERERVCZqIiKgqQRMREVUlaCIioqoETUREVJWgiYiIqhI0\nERFRVYImIiKqStBERERVCZqIiKgqQRMREVUlaCIioqoETUREVJWgiYiIqhI0ERFRVYImIiKqStBE\nRERVM3o9gOiOpF4PISLiuCRoTiru0X4TchFx/KrdOpM0T9KTkp6T9BNJny/1AUm7JW0tr082+twh\naaek7ZKWNOqLJG0ry+5p1E+T9GCpb5F0cWPZckk7yuu2WscZERETk13np2RJ5wLn2n5W0vuAHwLX\nAzcBr9v++pj1FwLfAT4MXAB8H1hg25KGgM/aHpL0KPDHtjdKWgn8uu2Vkm4GbrC9TNIc4GlgUdn8\nD4FFtveP2adrHf+J1rl11ssrml7sW5ws/30i+okkbHd9q6PaFY3tPbafLe03gJ/SCRAY/17MdcA6\n2wdt7wKeBxZLOg843fZQWe8BOoEFcC2wprQfAq4q7WuATbb3l3DZDCw9YQcXk0ZST14RceJMyqwz\nSZcAlwNbSulzkn4kabWk2aV2PrC70W03nWAaWx/hcGBdALwAYPsQcEDSWRNsK0467sErIk6k6pMB\nym2z/wZ8wfYbku4FvlwW3wl8DVhRexxHMjAw8Fa71WrRarV6NZSIiCmp3W7TbrePu3+1ZzQAkt4F\n/HfgL2x/Y5zllwDftf1BSbcD2P5qWbYRWAX8FfCk7UtL/Rbg47Z/r6wzYHuLpBnAS7bPkbQMaNn+\nV6XPfwaesP3gmP3nGU13e+/RvvNsKGIqmjLPaNT5zrgaGG6GTHnmMuoGYFtpbwCWSZopaT6wABiy\nvQd4TdLiss1bgUcafZaX9o3A46W9CVgiabakM4GrgcdO+EFGRMRR1bx19lHgd4EfS9paan
8A3CLp\nMjo/qv4c+DSA7WFJ64Fh4BCwsnG5sRK4H5gFPGp7Y6mvBtZK2gnsA5aVbb0i6U46M88ABsfOOIuI\niMlR9dbZVJdbZ13vvUf7zq2ziKloytw6i4iIgARNRERUlqCJiIiqEjQREVFVgiYiIqpK0ERERFUJ\nmoiIqCpBExERVSVoIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVo\nIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqKpa0EiaJ+lJSc9J+omkz5f6HEmb\nJe2QtEnS7EafOyTtlLRd0pJGfZGkbWXZPY36aZIeLPUtki5uLFte9rFD0m21jjMiIiZW84rmIPBF\n2x8APgJ8RtKlwO3AZtvvBx4v75G0ELgZWAgsBb4pSWVb9wIrbC8AFkhaWuorgH2lfjdwV9nWHOBL\nwBXltaoZaBERMXmqBY3tPbafLe03gJ8CFwDXAmvKamuA60v7OmCd7YO2dwHPA4slnQecbnuorPdA\no09zWw8BV5X2NcAm2/tt7wc20wmviIiYZJPyjEbSJcDlwFPAXNt7y6K9wNzSPh/Y3ei2m04wja2P\nlDrl6wsAtg8BBySdNcG2IiJiks042gqSzgX+PXCB7aXlFtdv2F7dzQ4kvY/O1cYXbL9++G4Y2LYk\nH9/QT4yBgYG32q1Wi1ar1bOxRERMRe12m3a7fdz9jxo0wP3AfcAflvc7gfXAUYNG0rvohMxa2w+X\n8l5J59reU26LvVzqI8C8RvcL6VyJjJT22Ppon4uAFyXNAM6wvU/SCNBq9JkHPDHeGJtBExERbzf2\nh/DBwcFj6t/NrbOzbT8I/C2A7YPAoaN1Kg/yVwPDtr/RWLQBWF7ay4GHG/VlkmZKmg8sAIZs7wFe\nk7S4bPNW4JFxtnUjnckFAJuAJZJmSzoTuBp4rItjjYiIE6ybK5o3ynMPACR9BDjQRb+PAr8L/FjS\n1lK7A/gqsF7SCmAXcBOA7WFJ64FhOkG20vbobbWVdK6sZgGP2t5Y6quBtZJ2AvuAZWVbr0i6E3i6\nrDdYJgVERMQk0+Hv5UdYQVoE/AnwAeA54BzgRts/qj+8uiT5aMc/VXQu5no11l7tu3f7PVn+XUT0\ngiRs6+hrlvW7+R+qPP/4+3RutW0vt89Oegmarvfeo30naCKmomMNmqM+o5H0Xjq3vP617W3AJZJ+\n+x2MMSIi+kg3kwHuA/4f8Jvl/Yt0pjtHREQcVTdB82u276ITNtj+v3WHFBER00k3QfMrSbNG30j6\nNeBX9YYUERHTSTfTmweAjcCFkr5DZ9ryP684poiImEYmDBpJpwBnAr9D5zcwQ+dXyfyi9sAiImJ6\n6OZzND+0vWiSxjOpMr256733aN+Z3hwxFZ3w6c3AZkm/X/6Q2ZzR1zsYY0RE9JFurmh2Mc6Plbbn\nVxrTpMkVTdd779G+c0UTMRVV+c0A01WCpuu992jfCZqIqehYg6abv0fzO7z9//YDwDbbL4/TJSIi\n4i3dTG/+F8BvAE/S+RHzE8D/BuZL+rLtByqOLyIiTnLdBM27gEtH//yypLnAWmAx8JdAgiYiIo6o\nm1ln80ZDpni51PZRfi1NRETEkXRzRfOkpO/R+fPNovPhzXb5rc75Y2IRETGhbqY3nwL8Ezq/egbg\nfwEPnTTTtSaQWWdd771H+86ss4ip6ITPOrP9pqRngAO2N0t6D/A+4PV3MM6IiOgT3fzhs38J/Ffg\nP5XShcDDNQcVERHTRzeTAT4DfAx4DcD2DuDv1RxURERMH139PRrbb/39GUkz6N3DgoiIOMl0EzT/\nQ9IfAu+RdDWd22jfrTusiIiYLrqZdXYqsAJYUkqPAd86aaZrTSCzzrree4/2nVlnEVPRCf8zAbb/\nls7D/5W2b7T9X7r97izp25L2StrWqA1I2i1pa3l9sr
HsDkk7JW2XtKRRXyRpW1l2T6N+mqQHS32L\npIsby5ZL2lFet3Uz3oiIOPGOGDTqGJD018DPgJ9J+mtJq9T58bob9wFLx9QMfN325eX1F2V/C4Gb\ngYWlzzcb+7kXWGF7AbBA0ug2VwD7Sv1u4K6yrTnAl4ArymuVpNldjjkiIk6gia5ovkjnQ5oftn2m\n7TPpfNP+aFl2VLZ/ALw6zqLxguo6YJ3tg7Z3Ac8DiyWdB5xue6is9wBwfWlfC6wp7YeAq0r7GmCT\n7f229wObeXvgRUTEJJgoaG4D/qntn48WbP8f4J+VZe/E5yT9SNLqxpXG+cDuxjq7gQvGqY+UOuXr\nC2Vsh4ADks6aYFsRETHJJvrNADNs/2Js0fYvyhTn43Uv8OXSvhP4Gp1bYD0xMDDwVrvVatFqtXo1\nlIiIKandbtNut4+7/0SBcfA4l02o+cfSJH2Lw1OlR4B5jVUvpHMlMlLaY+ujfS4CXizhd4btfZJG\ngFajzzzgifHG0wyaiIh4u7E/hA8ODh5T/4lunX1I0uvjvYAPHtdogfLMZdQNwOiMtA3AMkkzJc0H\nFgBDtvcAr0laXCYH3Ao80uizvLRvBB4v7U3AEkmzJZ0JXE1nWnZEREyyI17R2D71nW5c0jo6f5Hz\nbEkvAKuAlqTL6Mw++znw6bK/YUnrgWHgEJ3p1KPTqFcC9wOzgEdtbyz11cBaSTuBfcCysq1XJN0J\nPF3WGyyTAiIiYpId9QOb01k+sNn13nu073xgM2IqOuEf2IyIiHgnEjQREVFVgiYiIqpK0ERERFUJ\nmoiIqCpBExERVSVoIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVo\nIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVoIiKiqqpBI+nbkvZK\n2taozZG0WdIOSZskzW4su0PSTknbJS1p1BdJ2laW3dOonybpwVLfIunixrLlZR87JN1W8zgjIuLI\nal/R3AcsHVO7Hdhs+/3A4+U9khYCNwMLS59vSlLpcy+wwvYCYIGk0W2uAPaV+t3AXWVbc4AvAVeU\n16pmoEVExOSpGjS2fwC8OqZ8LbCmtNcA15f2dcA62wdt7wKeBxZLOg843fZQWe+BRp/mth4Crirt\na4BNtvfb3g9s5u2BFxERk2BGD/Y51/be0t4LzC3t84EtjfV2AxcAB0t71EipU76+AGD7kKQDks4q\n29o9zrYiunL4Ynpy2e7JfiNq6kXQvMW2JfX0/6yBgYG32q1Wi1ar1bOxxFTSi3+WvQm3iKNpt9u0\n2+3j7t+LoNkr6Vzbe8ptsZdLfQSY11jvQjpXIiOlPbY+2uci4EVJM4AzbO+TNAK0Gn3mAU+MN5hm\n0ERExNuN/SF8cHDwmPr3YnrzBmB5aS8HHm7Ul0maKWk+sAAYsr0HeE3S4jI54FbgkXG2dSOdyQUA\nm4AlkmZLOhO4Gnis5kFFRMT4ql7RSFoHfAI4W9ILdGaCfRVYL2kFsAu4CcD2sKT1wDBwCFjpwzes\nVwL3A7OAR21vLPXVwFpJO4F9wLKyrVck3Qk8XdYbLJMCIiJikqmfHz5K8sly/J2LuV6NtVf77r/9\nniz/HqO/ScJ21w8V85sBIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExER\nVSVoIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVoIiKiqgRNRERU\nlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqKpnQSNpl6QfS9oqaajU5kjaLGmHpE2SZjfWv0PS\nTknbJS1p1BdJ2laW3dOonybpwVLfIuniyT3CiIiA3l7RGGjZvtz2FaV2O7DZ9vuBx8t7JC0EbgYW\nAkuBb0pS6XMvsM
L2AmCBpKWlvgLYV+p3A3dNxkFFRMTf1etbZxrz/lpgTWmvAa4v7euAdbYP2t4F\nPA8slnQecLrtobLeA40+zW09BFx14ocfERFH0+srmu9LekbSp0ptru29pb0XmFva5wO7G313AxeM\nUx8pdcrXFwBsHwIOSJpzwo8iIiImNKOH+/6o7ZcknQNslrS9udC2Jbn2IAYGBt5qt1otWq1W7V1G\nRJxU2u027Xb7uPvLrv69/OiDkFYBbwCfovPcZk+5Lfak7X8g6XYA218t628EVgF/Vda5tNRvAT5u\n+/fKOgO2t0iaAbxk+5wx+/VUOP5udB5J9Wqsvdp3/+33ZPn3GP1NErbHPvo4op7cOpP0Hkmnl/Z7\ngSXANmADsLysthx4uLQ3AMskzZQ0H1gADNneA7wmaXGZHHAr8Eijz+i2bqQzuSAiIiZZr26dzQX+\nvEwcmwH8qe1Nkp4B1ktaAewCbgKwPSxpPTAMHAJWNi5FVgL3A7OAR21vLPXVwFpJO4F9wLLJOLCI\niPi7psSts17JrbOu996jfffffk+Wf4/R306KW2cREdE/EjQREVFVgiYiIqpK0ERERFUJmoiIqCpB\nExERVSVoIiKiqgRNRERUlaCJiIiqEjQREVFVgiYiIqpK0ERERFUJmoiIqCpBExERVSVoIiKiqgRN\nRERUlaCJiIiqEjQREVFVgiYiIqqa0esBRMRhUtd/hv2Est2T/UZ/SNBETCm9+Ibfm3CL/pFbZxER\nUdW0DhpJSyVtl7RT0r/r9XgiIvrRtA0aSacC/xFYCiwEbpF0aW9HNZW1ez2AmILa7XavhzBl5Fwc\nv2kbNMAVwPO2d9k+CPwZcF2PxzSFtXs9gJiC8s31sJyL4zedg+YC4IXG+92lFhERk2g6zzo74dN3\ndu3axfz580/0ZiN6bqJp1YODg9X2m2nV/UHT9T+0pI8AA7aXlvd3AG/avquxzvQ8+IiIymx3PS9+\nOgfNDOBnwFXAi8AQcIvtn/Z0YBERfWba3jqzfUjSZ4HHgFOB1QmZiIjJN22vaCIiYmqYzrPOjqif\nP8gp6duS9kra1qjNkbRZ0g5JmyTN7uUYJ4ukeZKelPScpJ9I+nyp9935kPRuSU9JelbSsKSvlHrf\nnYtRkk6VtFXSd8v7vjwXknZJ+nE5F0Oldkznou+CJh/k5D46x950O7DZ9vuBx8v7fnAQ+KLtDwAf\nAT5T/i303fmw/TfAlbYvAz4EXCnpY/ThuWj4AjDM4Rms/XouDLRsX277ilI7pnPRd0FDn3+Q0/YP\ngFfHlK8F1pT2GuD6SR1Uj9jeY/vZ0n4D+Cmdz1r16/n4ZWnOpPNc81X69FxIuhD4LeBbHP6to315\nLoqxM8yO6Vz0Y9Dkg5xvN9f23tLeC8zt5WB6QdIlwOXAU/Tp+ZB0iqRn6Rzzk7afo0/PBXA38G+B\nNxu1fj0XBr4v6RlJnyq1YzoX03bW2QQy+2ECtt1vny+S9D7gIeALtl9vfnixn86H7TeByySdATwm\n6coxy/viXEj6beBl21sltcZbp1/ORfFR2y9JOgfYLGl7c2E356Ifr2hGgHmN9/PoXNX0s72SzgWQ\ndB7wco/HM2kkvYtOyKy1/XAp9+35ALB9APgesIj+PBe/CVwr6efAOuAfS1pLf54LbL9Uvv4C+HM6\njx+O6Vz0Y9A8AyyQdImkmcDNwIYej6nXNgDLS3s58PAE604b6ly6rAaGbX+jsajvzoeks0dnDkma\nBVwNbKUPz4XtP7A9z/Z8YBnwhO1b6cNzIek9kk4v7fcCS4BtHOO56MvP0Uj6JPANDn+Q8ys9HtKk\nkbQO+ARwNp17q18CHgHWAxcBu4CbbO/v1RgnS5lV9ZfAjzl8S/UOOr9Foq/Oh6QP0nmoe0p5rbX9\nHyTNoc/ORZOkTwD/xva1/XguJM2ncxUDnUctf2r7K8d6LvoyaCIiYvL0462ziIiY
RAmaiIioKkET\nERFVJWgiIqKqBE1ERFSVoImIiKoSNBERUVWCJiIiqvr/WpO0NJHP9/oAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "a=df.count(axis=1)\n", "a.sort(ascending=False) \n", "a.plot(kind='hist');\n", "a.mean(), a.median(), a.std()" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(20797, 901571, 969996)" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# number of rows witha spouse\n", "len(df.spouse.dropna()), len(df.description.dropna()), len(df.birthDate.dropna())" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "URI\n", "http://dbpedia.org/resource/Ingeborg_Moen_Borgerud Norwegian businesswoman and politician\n", "http://dbpedia.org/resource/Ingeborg_Nilsson Figure skater\n", "http://dbpedia.org/resource/Ingeborg_Norell Finnish heroine / life saver\n", "http://dbpedia.org/resource/Ingeborg_Pehrson Danish actor\n", "http://dbpedia.org/resource/Ingeborg_Pf%C3%BCller Argentine shot putter and discus thrower\n", "http://dbpedia.org/resource/Ingeborg_Rasmussen Norwegian canoeist\n", "http://dbpedia.org/resource/Ingeborg_Refling_Hagen Norwegian writer\n", "http://dbpedia.org/resource/Ingeborg_Reichelt German soprano\n", "http://dbpedia.org/resource/Ingeborg_Roelofs Handball player\n", "http://dbpedia.org/resource/Ingeborg_S%C3%B8rensen Norwegian playmate\n", "http://dbpedia.org/resource/Ingeborg_Sch%C3%B6ner Actress\n", "http://dbpedia.org/resource/Ingeborg_Scheel Fencer\n", "http://dbpedia.org/resource/Ingeborg_Schmitz Swimmer\n", "http://dbpedia.org/resource/Ingeborg_Schwenzer German jurist\n", "http://dbpedia.org/resource/Ingeborg_Sj%C3%B6qvist Swedish diver\n", "http://dbpedia.org/resource/Ingeborg_Spangsfeldt Danish actor\n", "http://dbpedia.org/resource/Ingeborg_Steinholt politician\n", 
"http://dbpedia.org/resource/Ingeborg_Strandin Swedish opera singer\n", "http://dbpedia.org/resource/Ingeborg_W%C3%A6rstad Norwegian politician\n", "http://dbpedia.org/resource/Ingeborg_de_Beausacq Geographer Photographer\n", "Name: description, dtype: object" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.description.dropna().head(20)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false }, "outputs": [], "source": [ "df.to_csv('PersonDF.csv')" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-rw-r--r-- 1 admin staff 1.5G Oct 20 18:57 PersonDF.csv\r\n" ] } ], "source": [ "ls -lh PersonDF.csv" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!gzip PersonDF.csv" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }