{"id":366,"date":"2017-01-31T20:07:08","date_gmt":"2017-01-31T20:07:08","guid":{"rendered":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/?page_id=366"},"modified":"2022-10-10T09:29:23","modified_gmt":"2022-10-10T09:29:23","slug":"pure-publication-lists","status":"publish","type":"page","link":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/pure-publication-lists\/","title":{"rendered":"Pure publication lists"},"content":{"rendered":"<p>How to automatically update a list of publications from <a href=\"https:\/\/medewerker.uva.nl\/fnwi\/shared\/subsites\/bibliotheek\/nl\/a-z\/digitaal-publiceren\/digitaal-publiceren.html\">Pure<\/a> on a research group&#8217;s website?<\/p>\n<h2>Official route<\/h2>\n<p>Ask Pure people for the UUID of the research group and use UvA Dare which looks like:<\/p>\n<p><a href=\"http:\/\/dare.uva.nl\/search?org-uuid=1f129de9-e2f4-41a0-a223-94f32e993ac1&amp;smode=iframe\">http:\/\/dare.uva.nl\/search?org-uuid=1f129de9-e2f4-41a0-a223-94f32e993ac1&amp;smode=iframe<\/a><\/p>\n<p>but only validated publications will appear which I find unsatisfactory. Fortunately there is an unofficial route.<\/p>\n<h2>Unofficial route<\/h2>\n<p>Make a report in Pure, schedule it to be emailed, receive it, and process it. 
This will include not-yet-validated publications, but it gets messy:<\/p>\n<h3>&#8211; Make a report<\/h3>\n<p>See the video and use report type &#8216;Listing&#8217;:<br \/>\n<a href=\"http:\/\/www.atira.dk\/en\/pure\/screencasts\/how-to-get-familiar-with-reporting-4.12.html\">http:\/\/www.atira.dk\/en\/pure\/screencasts\/how-to-get-familiar-with-reporting-4.12.html<\/a><\/p>\n<h3>&#8211; Schedule it to be emailed<\/h3>\n<p>Schedule the report to be sent in HTML format to a Gmail account dedicated to this purpose:<br \/>\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-375\" src=\"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-content\/uploads\/SchedulePureReport.png\" alt=\"SchedulePureReport\" width=\"644\" height=\"229\" srcset=\"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-content\/uploads\/SchedulePureReport.png 644w, https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-content\/uploads\/SchedulePureReport-300x107.png 300w, https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-content\/uploads\/SchedulePureReport-624x222.png 624w\" sizes=\"(max-width: 644px) 100vw, 644px\" \/><\/p>\n<h3>&#8211; Receive it<\/h3>\n<p>Set your Gmail username and password in the script below, then run it to install and configure &#8216;fetchmail&#8217;, &#8216;procmail&#8217; and &#8216;mpack&#8217;. 
Only tested on Ubuntu Linux.<\/p>\n<pre># based on: https:\/\/outhereinthefield.wordpress.com\/2015\/06\/14\/scripting-gmail-download-and-saving-the-attachments-with-fetchmail-procmail-and-munpack\/\n####################################\nemail=my-gmail-name\npassword=my-gmail-password\n####################################\n\n### install software\nsudo apt-get install fetchmail procmail mpack\n\n### config fetchmail\necho \"poll pop.gmail.com\nprotocol pop3\ntimeout 300\nport 995\nusername \\\"${email}@gmail.com\\\" password \\\"${password}\\\"\nkeep\nmimedecode\nssl\nsslcertck\nsslproto TLS1\nmda \\\"\/usr\/bin\/procmail -m '$HOME\/.procmailrc'\\\"\" &gt; $HOME\/.fetchmailrc\n\nchmod 700 $HOME\/.fetchmailrc\n\n### config procmail\necho \"LOGFILE=\/home\/${USER}\/.procmail.log\nMAILDIR=\/home\/${USER}\/\nVERBOSE=on\n\n:0\nMaildir\/\" &gt; $HOME\/.procmailrc\n\nmkdir -p $HOME\/Maildir\/process\nmkdir -p $HOME\/Maildir\/process\/landing\nmkdir -p $HOME\/Maildir\/process\/extract\nmkdir -p $HOME\/Maildir\/process\/store\nmkdir -p $HOME\/Maildir\/process\/archive\n<\/pre>\n<p>Then use this script in a cron job to copy the &#8216;.html&#8217; attachment to the target file (the email with the report is expected around 1:00 am):<\/p>\n<pre>#!\/bin\/bash\n####################################\ntargetfile=\/var\/www\/publications.html\n####################################\n\nDIR=$HOME\/Maildir\nLOG=$HOME\/Maildir\/getpublications.log\ndate +%r-%-d\/%-m\/%-y &gt;&gt; $LOG\nfetchmail\nmv $DIR\/new\/* $DIR\/process\/landing\/\ncd $DIR\/process\/landing\/\nshopt -s nullglob\nfor i in *\ndo\n  echo \"processing $i\" &gt;&gt; $LOG\n  mkdir $DIR\/process\/extract\/$i\n  cp $i $DIR\/process\/extract\/$i\/\n  echo \"saving backup $i to archive\"  &gt;&gt; $LOG\n  mv $i $DIR\/process\/archive\n  echo \"unpacking $i\" &gt;&gt; $LOG\n  munpack -C $DIR\/process\/extract\/$i -q $DIR\/process\/extract\/$i\/$i\n  find $DIR\/process\/extract\/$i -name '*.html' -exec cp {} ${targetfile} \\;\n\ndone\nshopt -u nullglob\necho \"finishing..\" &gt;&gt; $LOG\nmv $DIR\/process\/extract\/* $DIR\/process\/store\/ \necho \"done!\" &gt;&gt; $LOG\n<\/pre>\n<h3>&#8211; Process it<\/h3>\n<p>Add this to the script above to clean up the report and add links:<\/p>\n<pre># remove header and footer\nperl -i -0pe 's\/&lt;h1 class=\"ReportTitle\"&gt;.*?&lt;br&gt;\/\/igs' ${targetfile}\nperl -i -0pe '$datestring = localtime(); s\/&lt;span class=\"body\"&gt;.*?&lt;br&gt;.*?&lt;br&gt;\/&lt;span class=\"body\"&gt;updated $datestring&lt;\\\/span&gt;\/igs' ${targetfile} # insert update time\nperl -i -0pe 's\/&lt;h2 class=\"ReportElementTitle\"&gt;.*?&lt;\\\/h2&gt;\/\/igs' ${targetfile}\nperl -i -0pe 's\/&lt;p class=\"reportdescription\"&gt;.*?&lt;\\\/p&gt;\/\/igs' ${targetfile}\n\n# remove paragraph counts\nperl -i -0pe 's\/(&lt;h3 class=\"ListGroupingTitle1\"&gt;).*?\\. \/$1\/igs' ${targetfile}\n\n# add links\nperl  -i -0pe 's\/(?&lt;total&gt;&lt;strong&gt;(?&lt;title&gt;.*?)&lt;\\\/strong&gt;)\/&lt;a href=\"https:\\\/\\\/www.google.nl\\\/#q=%22$+{title}%22\"&gt;$+{total}&lt;\\\/a&gt;\/igs' ${targetfile}\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>How to automatically update a list of publications from Pure on a research group&#8217;s website? Official route Ask Pure people for the UUID of the research group and use UvA Dare which looks like: http:\/\/dare.uva.nl\/search?org-uuid=1f129de9-e2f4-41a0-a223-94f32e993ac1&amp;smode=iframe but only validated publications will appear which I find unsatisfactory. Fortunately there is an unofficial route. 
Unofficial route Make a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/pages\/366"}],"collection":[{"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/comments?post=366"}],"version-history":[{"count":26,"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/pages\/366\/revisions"}],"predecessor-version":[{"id":811,"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/pages\/366\/revisions\/811"}],"wp:attachment":[{"href":"https:\/\/staff.fnwi.uva.nl\/b.terwijn\/wp-json\/wp\/v2\/media?parent=366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}