Added anonbib for use in papers list
Source: https://gitweb.torproject.org/anonbib.git Commit: b478fc493d4be2115185d94e077bf06196495417
i2p2www/anonbib/BibTeX.py (new file, 1268 lines): diff suppressed because it is too large.
i2p2www/anonbib/Makefile (new file, 39 lines)
@@ -0,0 +1,39 @@
PYTHON=python
VERSION=0.3-dev

all:
	$(PYTHON) writeHTML.py anonbib.cfg

clean:
	rm -f *~ */*~ *.pyc *.pyo

update:
	$(PYTHON) updateCache.py anonbib.cfg
	$(PYTHON) rank.py anonbib.cfg

suggest:
	$(PYTHON) rank.py suggest anonbib.cfg

test:
	$(PYTHON) tests.py

veryclean: clean
	rm -f author.html date.html topic.html bibtex.html tmp.bib

TEMPLATES=_template_.html _template_bibtex.html
CSS=css/main.css css/pubs.css
BIBTEX=anonbib.bib
SOURCE=BibTeX.py config.py metaphone.py reconcile.py updateCache.py \
	writeHTML.py rank.py tests.py
EXTRAS=TODO README Makefile ChangeLog anonbib.cfg gold.gif silver.gif \
	upb.gif ups.gif

DISTFILES=$(TEMPLATES) $(CSS) $(BIBTEX) $(SOURCE) $(EXTRAS)

dist: clean
	rm -rf anonbib-$(VERSION)
	mkdir anonbib-$(VERSION)
	tar cf - $(DISTFILES) | (cd anonbib-$(VERSION); tar xf -)
	mkdir anonbib-$(VERSION)/cache
	tar czf anonbib-$(VERSION).tar.gz anonbib-$(VERSION)
	rm -rf anonbib-$(VERSION)
i2p2www/anonbib/README (new file, 52 lines)
@@ -0,0 +1,52 @@
anonbib 0.3 -- Code to generate the anonymity bibliography

Copyright (c) 2003-2008 Nick Mathewson
Based on 'PDOSBib' perl code by Eddie Kohler

This software is licensed under the GNU GPL, version 2 or later.

To use this software, you need to understand BibTeX and Python a
little.  If it breaks, you get to keep both pieces.  You will need
Python 2.2 or later.

To use this package:
  - Get a good BibTeX file.  You may want to mark it up with some of the
    extra keys used in our "anonbib.bib" file.  All of the additional
    Bibtex keys we use have the prefix "www_"; check out anonbib.bib
    for their usage.

  - Edit anonbib.cfg and _template_.html and _template_bibtex.html so they
    refer to your files, authors, topics, and so on.

  - Run 'python updateCache.py anonbib.cfg' to create a local cache of the
    papers in your bibliography based on their www_*_url entries.  (By
    default, the script will ignore any entries you have already cached.  To
    force a fresh download of a cached file, delete it.)

  - Run 'python rank.py anonbib.cfg' to download Google Scholar rankings of
    all the papers.

  - Run 'python writeHTML.py anonbib.cfg'.  Fix any errors you care about.

  - Re-run these scripts when you change the bibliography.

  - If you want to merge in big BibTeX files, try using the reconcile.py
    script.  See the comment at the start of the file for usage info.


New in 0.3:
  - Support for Google Scholar rankings to denote hot/rising papers.
    Implemented by George Danezis.
  - Make reconcile script generate more useful output.
  - Add support for multiple bibliographies generated from a single bibtex
    source.  This is done via 'tags' on bibtex entries.  If an entry is
    tagged, it appears in the corresponding bibliographies.  This is good
    for generating a master bibliography and one or more selected readings
    lists from the same source.
  - Handle more errors when downloading files.
  - When fetching a paper with a .ps url, generate the .ps.gz file
    automatically.
  - Note an error when a crossref overrides an existing field in an entry.
  - Handle the Proceedings type correctly.
  - Enforce proper encoding on pages: it must be number--number.
  -
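For reference, the README workflow can also be driven from a Python 2 prompt using only functions that appear in this commit (config.load, BibTeX.parseFile, updateCache.downloadAll); a minimal sketch, assuming the anonbib.cfg above is in the working directory:

    import config
    import BibTeX
    import updateCache

    config.load("anonbib.cfg")                 # populate config.* globals
    bib = BibTeX.parseFile(config.MASTER_BIB)  # parse the master .bib file
    # Fetch only papers not cached yet; downloadAll() returns
    # (key, ftype, url, error) tuples for anything that failed.
    errors = updateCache.downloadAll(bib, missingOnly=1)
    for key, ftype, url, msg in errors:
        print "failed %s.%s: %s" % (key, ftype, msg)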
i2p2www/anonbib/TODO (new file, 33 lines)
@@ -0,0 +1,33 @@


- More general tasks
  . Know about @book
  . Write unit tests for everything
  . Make name parsing vaguely sane
  - Maybe uncrossref in tmp.bib
  - Maybe pull important papers to the start of their sections?
  . Clean \{}~ when going from note to url; add \{}~ when making
    note from url.
  . Also clean \_ to _ and back
  - Look for urls in wherepublished.
  . Forgive newlines in wherepublished, note.
  - When sorting by date, entries with unknown months go into a magic
    "month zero" before January.  Is this right?
  - Strip unused features.
  o Take a configuration file on the command line instead of just
    importing config.py.

- Cache tasks
  - Generate a list of broken links
  - Re-download all cached items if requested
  - Clear dead items from cache
  - Use HTTP HEAD requests to decide whether to update stale
    elements in cache.
  - Add ability to honor a "www_no_cache={1}" option for entries
    if the authors ask us not to cache them.
  - Maybe, add ability to cache images from an HTML page.

- Reconcile tasks
  - Document it.
  - Notice when there is new or different information of certain kinds
    (pages, dates, etc) in the new information.
i2p2www/anonbib/_template_.html (new file, 74 lines)
@@ -0,0 +1,74 @@
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />

<!-- *** I AM MACHINE GENERATED! DO NOT EDIT ME!
     *** EDIT THE .bib FILE or _template_.html INSTEAD!

     Generated by `%(command_line)s'
     (c) Eddie Kohler 1999-2000, Nick Mathewson 2003 -->

<!-- anon bib anonymity bibliography -->
<title>%(title)s</title>

<link rel="stylesheet" type="text/css" href="%(root)s/css/main.css" />
<link rel="stylesheet" type="text/css" href="%(root)s/css/pubs.css" />

</head>
<body bgcolor="#ffffff" text="#000000" link="#bb0000" vlink="#990099"
      alink="#ff9900" >

<h1 align="center">%(short_title)s</h1>
<p align="center">%(otherbibs)s</p>
<p align="center">%(choices)s</p>

<!-- Table 1: contains the sidebar and the body. -->
<table cellspacing="3" cellpadding="0" border="0" align="center" width="100%%">
<tr valign="top">
<td>

<!-- Table 2: The sidebar-->
<table align="right" cellspacing="0" cellpadding="5" width="100"
       class="sidebar">
<tr valign="top"><td><p class="l1"><strong>%(sectiontypes)s:</strong><br /></p>
%(sections)s
</td>
</tr>
</table><!-- End of table 2 -->
</td>

<td width="85%%">

<h2>Publications by %(field)s</h2>

<ul class="sections">
%(entries)s
</ul>

</td>

<td width="5%%"><br /></td>

</tr>
</table><!-- End of table 1 -->


<p class="contact">Please send new or corrected entries to
<a name="contact">
<span class="email">&lt;anonbib@free<!-- -->haven.net&gt;</span></a>.
<br />
If you can, please format them as BibTeX; see our
<a href="bibtex.html">BibTeX source page</a> for examples.<br />
Remember to include URLs if possible:
<a href="http://citeseer.ist.psu.edu/online-nature01/">offline papers are
less useful</a>.
</p>
<p>The source code to anonbib is now in Git. See <a href="https://gitweb.torproject.org/anonbib.git">the anonbib gitweb page</a> for more information.
</p>

</body>
</html>
i2p2www/anonbib/_template_bibtex.html (new file, 28 lines)
@@ -0,0 +1,28 @@
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<meta http-equiv="Content-Style-Type" content="text/css" />

<!-- *** I AM MACHINE GENERATED! DO NOT EDIT ME!
     *** EDIT THE .bib FILE or _template_.html INSTEAD!

     Generated by `%(command_line)s'
     (c) Eddie Kohler 1999-2000, Nick Mathewson 2003 -->

<title>%(title)s: BibTeX</title>

<link rel="stylesheet" type="text/css" href="%(root)s/css/main.css" />
<link rel="stylesheet" type="text/css" href="%(root)s/css/pubs.css" />

</head>
<body bgcolor="#ffffff" text="#000000" link="#bb0000" vlink="#990099"
      alink="#ff9900" >

<table cellspacing="15" border="0" align="center" width="100%%">
%(entries)s
</table>
</body>
</html>
i2p2www/anonbib/anonbib.bib (new file, 6293 lines): diff suppressed because it is too large.
i2p2www/anonbib/anonbib.cfg (new file, 163 lines)
@@ -0,0 +1,163 @@
# Copyright 2003-2008, Nick Mathewson.  See LICENSE for licensing info.

# Our input filename.
MASTER_BIB = "./anonbib.bib"

# Where do we put generated HTML?
OUTPUT_DIR = "."

# Where do we put cached papers (relative to OUTPUT_DIR)
CACHE_DIR = "cache"

# Where do we cache citations papers (relative to OUTPUT_DIR)
CITE_CACHE_DIR = "cite_cache"

# Are there subsections for cached papers?  This is useful for putting
# different Apache permission on different directories.
CACHE_SECTIONS = [ ]

# Only include entries that have this key.  This is one way to
# generate multiple bibliographies from the same source.  Currently
# deprecated in favor of tags.
#
#example: REQUIRE_KEY = "www_selected"
#
REQUIRE_KEY = None

# Timeout when downloading from a server while caching, in seconds.
DOWNLOAD_CONNECT_TIMEOUT = 15

# Template files.
TEMPLATE_FILE = "./_template_.html"
BIBTEX_TEMPLATE_FILE = "./_template_bibtex.html"

# Map from author name regex to author homepage.
AUTHOR_URLS = {
    'Ross.*Anderson' : 'http://www.cl.cam.ac.uk/users/rja14/',
    'Alessandro.*Acquisti' : 'http://www.heinz.cmu.edu/~acquisti/index.html',
    'Agrawal' : 'http://www.research.ibm.com/people/a/agrawal/',
    'Adam.*Back' : 'http://www.cypherspace.org/~adam/',
    'Berthold' : 'http://page.inf.fu-berlin.de/~berthold/',
    'Borisov' : 'http://hatswitch.org/~nikita/',
    'Bettati' : 'http://faculty.cs.tamu.edu/bettati/',
    'Miguel.*Castro' : 'http://research.microsoft.com/users/mcastro/',
    'Chaum' : 'http://www.chaum.com/',
    'J.*Claessens' : 'http://www.esat.kuleuven.be/~joclaess/',
    'R.*Clayton' : 'http://www.cl.cam.ac.uk/~rnc1/',
    'Wei Dai' : 'http://www.eskimo.com/~weidai/',
    'Danezis' : 'http://homes.esat.kuleuven.be/~gdanezis/',
    'Claudia.*az' : 'http://www.esat.kuleuven.be/~cdiaz/',
    'Dingledine' : 'http://www.freehaven.net/~arma/cv.html',
    'Desmedt' : 'http://www.cs.fsu.edu/~desmedt/',
    'Douceur' : 'http://research.microsoft.com/~johndo/',
    'N.*Hopper' : 'http://www-users.cs.umn.edu/~hopper/',
    'Michael.*Freedman' : 'http://www.scs.cs.nyu.edu/~mfreed/',
    'Gergely' : 'http://www.planeforge.com/home/tgm',
    'Ian.*Goldberg' : 'http://www.cs.uwaterloo.ca/~iang/',
    'Christian.*Grothoff' : 'http://grothoff.org/christian/',
    'D.*Hopwood' : 'http://www.users.zetnet.co.uk/hopwood/',
    'Jakobsson' : 'http://www2.parc.com/csl/members/mjakobss/markus-jakobsson.htm',
    'Juels' : 'http://www.rsasecurity.com/rsalabs/staff/bios/ajuels/',
    'Kaashoek' : 'http://pdos.csail.mit.edu/~kaashoek/',
    'K.*Kurosawa' : 'http://kuro.cis.ibaraki.ac.jp/~kurosawa/',
    'H.*Langos' : 'http://www.wh9.tu-dresden.de/~heinrich/',
    'B.*Liskov' : 'http://www.pmg.lcs.mit.edu/barbara_liskov.html',
    'Mathewson' : 'http://www.wangafu.net/~nickm/',
    'Mazières' : 'http://www.scs.cs.nyu.edu/~dm/',
    'B.*Möller' : ('http://www.informatik.tu-darmstadt.de/TI/'
                   'Mitarbeiter/moeller.html'),
    'U.*Möller' : 'http://www.ulfm.de/',
    'D.*Molnar' : 'http://www.cs.berkeley.edu/~dmolnar/',
    'R.*Morris' : 'http://www.pdos.lcs.mit.edu/~rtm/',
    'S.*Murdoch' : 'http://www.cl.cam.ac.uk/users/sjm217/',
    'A.*Pashalidis' : 'http://www.xrtc.com/',
    'A.*Pfitzmann' : 'http://dud.inf.tu-dresden.de/~pfitza/',
    'B.*Pfitzmann' : 'http://www.zurich.ibm.com/~bpf/',
    'B.*Preneel' : 'http://www.esat.kuleuven.be/~preneel/',
    'Daniel.*Simon' : 'http://research.microsoft.com/crypto/dansimon/me.htm',
    'Rackoff' : 'http://www.cs.toronto.edu/DCS/People/Faculty/rackoff.html',
    'Jean F' : 'http://www.geocities.com/j_f_raymond/',
    'M.*Rennhard' : 'http://www.tik.ee.ethz.ch/~rennhard/',
    'M.*Reiter' : 'http://www.ece.cmu.edu/~reiter/',
    'Rivest' : 'http://theory.lcs.mit.edu/~rivest/',
    'Avi.*Rubin' : 'http://avirubin.com/',
    'Sassaman' : 'http://homes.esat.kuleuven.be/~lsassama/',
    'Serjantov' : 'http://home.arachsys.com/~aas/',
    'S.*Seys' : 'http://www.esat.kuleuven.be/~sseys/',
    'Shoup' : 'http://www.shoup.net/',
    'Syverson' : 'http://www.syverson.org/',
    'Tsudik' : 'http://www.ics.uci.edu/~gts/c.html',
    'M.*Waidner' : 'http://www.zurich.ibm.com/~wmi/',
    'David.*Wagner' : 'http://www.cs.berkeley.edu/~daw/',
    'M.*Waldman' : 'http://cs1.cs.nyu.edu/~waldman/',
    'B.*Waters' : 'http://www.cs.utexas.edu/~bwaters/',
    'Chenxi.*Wang' : 'http://www.ece.cmu.edu/~chenxi/',
    'M.*Wright' : 'http://ranger.uta.edu/~mwright/',
    'B.*Levine' : 'http://prisms.cs.umass.edu/brian/',
    'T.*Benjamin' : 'http://www.cs.umass.edu/~tshb/',
    'B.*Defend' : 'http://www.cs.umass.edu/~defend/',
    'K.*Fu' : 'http://www.cs.umass.edu/~kevinfu/',
    'J.*Camenisch' : 'http://www.zurich.ibm.com/~jca/',
    'S.*Hohenberger' : 'http://www.cs.jhu.edu/~susan/',
    'M.*Kohlweiss' : 'http://homes.esat.kuleuven.be/~mkohlwei/',
    'A.*Lysyanskaya' : 'http://www.cs.brown.edu/~anna/',
    'M.*Meyerovich' : 'http://www.cs.brown.edu/~mira/',
    'P.*Zieli.*ski' : 'http://www.cl.cam.ac.uk/~pz215/',
    'S.*Zander' : 'http://caia.swin.edu.au/cv/szander/'
    }

# List of patterns for author names _not_ to do an initial-tolerant
# match on when building section list.  E.g., if "J\\. Smith" is in
# this list, he won't be folded into "John Smith".
NO_COLLAPSE_AUTHORS = [

    ]

# Map from LaTeX-style name of author to collapse to canonical name.
COLLAPSE_AUTHORS = {
    "Nicholas Mathewson": "Nick Mathewson",
    }

# Map from author pattern to collation key.
# This keeps 'Zero Knowledge Systems' from getting alphabetized as "Systems,
# Zero Knowledge."
ALPHABETIZE_AUTHOR_AS = {
    "Zero.*Knowledge.*Systems": "Zero Knowledge Systems",
    "Carlos.*Aguilar.*Melchor": "Aguilar Melchor Carlos",
    }

# Map of strings to initialize BibTeX parsing with.
INITIAL_STRINGS = {
    # SECTIONS
    'sec_mix'        : "Mix Networks: Design",
    'sec_mixattacks' : "Mix Networks: Attacks",
    'sec_stream'     : "Stream-based anonymity",
    'sec_traffic'    : "Traffic analysis",
    'sec_pub'        : "Anonymous publication",
    'sec_pir'        : "Private Information Retrieval",
    'sec_nym'        : "Pseudonymity"
    }

# Don't put in any entries of this type.
OMIT_ENTRIES = ("proceedings", "journal")

# List of all recognized values for www_tags.
ALL_TAGS = ("selected", )

# Titles of page, by tag.
TAG_TITLES = { "": "The Free Haven Anonymity Bibliography",
               "selected": "Free Haven's Selected Papers in Anonymity"
               }

# As TAG_TITLES, but shorter.
TAG_SHORT_TITLES = { "": "Anonymity Bibliography",
                     "selected": "Selected Papers in Anonymity",
                     }

# Directories where tag pages get generated.
TAG_DIRECTORIES = { '': "full",
                    "selected": "" }

# Make cached stuff group-writable.  Make sure that your cache directories
# are sticky!
CACHE_UMASK = 002
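The tag maps above are how one BibTeX source yields several bibliographies. A hypothetical extension (not part of this commit) adding a third "readings" page would only need to touch the tag settings:

    ALL_TAGS = ("selected", "readings")   # values recognized in www_tags
    TAG_TITLES["readings"] = "A Selected Readings List"
    TAG_SHORT_TITLES["readings"] = "Selected Readings"
    TAG_DIRECTORIES["readings"] = "readings"   # pages land in ./readings/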
i2p2www/anonbib/config.py (new file, 56 lines)
@@ -0,0 +1,56 @@
# Copyright 2003-2006, Nick Mathewson.  See LICENSE for licensing info.

import re

_KEYS = [ "ALL_TAGS",
    "ALPHABETIZE_AUTHOR_AS","AUTHOR_URLS","CACHE_DIR","CACHE_SECTIONS",
    "CACHE_UMASK",
    "CITE_CACHE_DIR",
    "COLLAPSE_AUTHORS",
    "DOWNLOAD_CONNECT_TIMEOUT","INITIAL_STRINGS",
    "MASTER_BIB", "NO_COLLAPSE_AUTHORS", "OMIT_ENTRIES",
    "OUTPUT_DIR", "TEMPLATE_FILE", "BIBTEX_TEMPLATE_FILE",
    "REQUIRE_KEY", "TAG_TITLES", "TAG_DIRECTORIES", "TAG_SHORT_TITLES",
    ]

for _k in _KEYS:
    globals()[_k]=None
del _k

def load(cfgFile):
    mod = {}
    execfile(cfgFile, mod)
    for _k in _KEYS:
        try:
            globals()[_k]=mod[_k]
        except KeyError:
            raise KeyError("Configuration option %s is missing"%_k)

    INITIAL_STRINGS.update(_EXTRA_INITIAL_STRINGS)
    AUTHOR_RE_LIST[:] = [
        (re.compile(k, re.I), v,) for k, v in AUTHOR_URLS.items()
        ]

    NO_COLLAPSE_AUTHORS_RE_LIST[:] = [
        re.compile(pat, re.I) for pat in NO_COLLAPSE_AUTHORS
        ]

    ALPHABETIZE_AUTHOR_AS_RE_LIST[:] = [
        (re.compile(k, re.I), v,) for k,v in ALPHABETIZE_AUTHOR_AS.items()
        ]

_EXTRA_INITIAL_STRINGS = {
    # MONTHS
    'jan' : 'January', 'feb' : 'February',
    'mar' : 'March', 'apr' : 'April',
    'may' : 'May', 'jun' : 'June',
    'jul' : 'July', 'aug' : 'August',
    'sep' : 'September', 'oct' : 'October',
    'nov' : 'November', 'dec' : 'December',
    }

AUTHOR_RE_LIST = []

NO_COLLAPSE_AUTHORS_RE_LIST = []

ALPHABETIZE_AUTHOR_AS_RE_LIST = []
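Note that a configuration file is itself Python: load() runs it with execfile() and copies every name in _KEYS into this module's globals, raising KeyError naming any option the file forgot to define. A short sketch of the resulting behavior:

    import config

    config.load("anonbib.cfg")            # KeyError if an option is missing
    print config.MASTER_BIB               # "./anonbib.bib"
    print config.INITIAL_STRINGS['jan']   # month strings merged in: "January"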
i2p2www/anonbib/css/main.css (new file, 111 lines)
@@ -0,0 +1,111 @@
img {
  border: 0px;
}

BODY {
  background-color: #FFF;
  color: #000;
  margin: 0px;
}

FORM {
  margin-top: 0.5em;
  margin-bottom: 0.5em;
}

P, TD {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
}

P.contact {
  text-align: center;
}

P.contact A {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-weight: normal;
}

SPAN.email {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
  font-weight: bold;
}

P IMG {
  vertical-align: text-bottom;
}

P.crumbbreadth {
  margin-top: 0.25em;
}

.compact {
  margin-top: -0.5em;
  text-indent: 0em;
}

SPAN.biblio {
  font-style: italic;
}

SPAN.biblio A {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-weight: normal;
  text-decoration: underline;
}

SPAN.availability {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
  font-weight: normal;
}

UL {
  list-style: outside;
}

UL.expand {
  margin-bottom: 1em;
}

UL.sections {
  list-style: none;
}

/* Font-level properties */

PRE {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
}

STRONG, A {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Rockwell", "Lucida Sans Unicode", monospace;
  font-weight: bold;
}

A:link {
  color: #B00;
}

A:visited {
  color: #903;
}

H1, H2, H3, H4, H5, H6 {
  font-family: lucidatypewriter, "Lucida Typewriter", "Lucida Console", Monaco, monospace;
}

H1 A, H2 A, H3 A, H4 A, H5 A, H6 A {
  font-family: lucidatypewriter, "Lucida Typewriter", "Lucida Console", Monaco, monospace;
}

H1 {
  color: #00B;
}

H2 {
  color: #006;
}

H3 {
  color: #006;
}
i2p2www/anonbib/css/pubs.css (new file, 121 lines)
@@ -0,0 +1,121 @@
SPAN.title {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-weight: bold;
}

SPAN.author {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-weight: normal;
}

SPAN.availability {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
  font-weight: normal;
}

SPAN.author A {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-weight: normal;
}

SPAN.biblio {
  font-family: lucida, "Lucida Sans Unicode", Geneva, sans-serif;
  font-style: italic;
}

SPAN.biblio A {
  text-decoration: underline;
}

SPAN.bad {
  text-decoration: underline;
  color: #000;
  background-color: #FDF;
}

P.remarks {
  font-family: serif;
  margin-top: 0.3em;
  margin-bottom: 0;
  margin-left: 5em;
  padding-left: 0.5em;
  border-width: 0 0 0 5px;
  border-color: black;
  border-style: solid;
}

P.remarks A {
  text-decoration: underline;
}

P.l1 {
  margin-left: 0.5em;
}

P.l2 {
  margin-left: 1em;
  margin-top: 0.3em;
  margin-bottom: 0.3em;
}

P.entry {
  margin-top: 0.7em;
  margin-bottom: 0;
}

DIV.impEntry {
  border-width: 1px;
  border-color: black;
  border-style: solid;
  background-color: #FFE;
  padding: 0.3em;
  margin-top: 0.7em;
  margin-bottom: 0;
}

P.impEntry {
  background-color: #FFE;
  padding: 0;
  margin-top: 0;
  margin-bottom: 0;
}

DIV.draftEntry {
  /*
  border-width: 1px;
  border-color: black;
  border-style: solid;
  padding: 0.3em;
  margin-top: 0.7em;
  margin-bottom: 0;
  */
}

P.draftEntry {
  color: #555;
  padding: 0;
  margin-top: 0;
  margin-bottom: 0;
}

TABLE.sidebar {
  border-width: 2px;
  border-color: black;
  border-style: solid;
  background-color: #CFF;
}

TD.bibtex {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
  border-width: 2px;
  font-weight: normal;
  border-color: black;
  border-style: solid;
  background-color: #DFF;
}

PRE.bibtex {
  font-family: lucidatypewriter, "Lucida Typewriter", Monaco, "Lucida Sans Unicode", monospace;
  font-size: smaller;
}
i2p2www/anonbib/gold.gif (new binary file, 540 B): binary file not shown.
i2p2www/anonbib/metaphone.py (new file, 193 lines)
@@ -0,0 +1,193 @@
#!/usr/bin/python2
# Copyright 2003-2008, Nick Mathewson.  See LICENSE for licensing info.

"""metaphone.py -- Pure-python metaphone implementation.

   (This is not guaranteed to match the real metaphone algorithm; I
   haven't tested it thoroughly enough.  Let me know if you find bugs.

   Based on the original C++ metaphone implementation.)
"""

TRIPLES = {
    'dge': 'j',
    'dgi': 'j',
    'dgy': 'j',
    'sia': '+x',
    'sio': '+x',
    'tia': '+x',
    'tio': '+x',
    'tch': '',
    'tha': '0',
    'the': '0',
    'thi': '0',
    'tho': '0',
    'thu': '0',
}

DOUBLES = {
    'ph' : 'f',
    'sh' : 'x'
}

SINGLETONS = {
    'd': 't',
    'f': 'f',
    'j': 'j',
    'l': 'l',
    'm': 'm',
    'n': 'n',
    'r': 'r',
    'p': 'p',
    'q': 'k',
    'v': 'f',
    'x': 'ks',
    'z': 's',
}

ALLCHARS = "".join(map(chr, range(256)))
NONLCCHARS = "".join([c for c in ALLCHARS if not c.islower()])
def metaphone(s):
    """Return the metaphone equivalent of a provided string"""
    s = s.lower()
    s = s.translate(ALLCHARS, NONLCCHARS)

    if not s: return ""

    # If ae, gn, kn, pn, wr then drop the first letter.
    if s[:2] in ("ae", "gn", "kn", "pn", "wr"):
        s = s[1:]

    # Change "x" to "s"
    if s[0] == 'x':
        s = "s%s" % s[1:]

    # Get rid of "h" in "wh".
    if s[:2] == 'wh':
        s = "w%s" % s[1:]

    # Get rid of s from end.
    if s[-1] == 's':
        s = s[:-1]

    result = []
    prevLtr = ' '
    vowelBefore = 0
    lastChar = len(s)-1
    for idx in range(len(s)):
        curLtr = s[idx]
        # If first char is a vowel, keep it.
        if curLtr in "aeiou":
            if idx == 0:
                result.append(curLtr)
            continue

        # Skip double letters.
        if idx < lastChar:
            if curLtr == s[idx+1]:
                continue

        try:
            r = TRIPLES[s[idx:idx+3]]
            if r == "+x":
                if idx > 1:
                    result.append("x")
                continue
            else:
                result.append(r)
                continue
        except KeyError:
            pass
        try:
            r = DOUBLES[s[idx:idx+2]]
            result.append(r)
            continue
        except KeyError:
            pass
        try:
            r = SINGLETONS[s[idx]]
            result.append(r)
            continue
        except KeyError:
            pass

        if idx > 0:
            prevLtr = s[idx-1]
            vowelBefore = prevLtr in "aeiou"
        curLtr = s[idx]

        nextLtr2 = ' '
        if idx < lastChar:
            nextLtr = s[idx+1]
            vowelAfter = nextLtr in "aeiou"
            frontvAfter = nextLtr in "eiy"
            if idx+1 < lastChar:
                nextLtr2 = s[idx+2]
        else:
            nextLtr = ' '
            vowelAfter = frontvAfter = 0


        if curLtr == 'b':
            if idx == lastChar and prevLtr == 'm':
                pass
            else:
                result.append(curLtr)
        elif curLtr == 'c':
            # silent 'sci', 'sce, 'scy', 'sci', etc OK.
            if not (prevLtr == 's' and frontvAfter):
                if nextLtr in 'ia':
                    result.append("x")
                elif frontvAfter:
                    result.append("s")
                elif prevLtr == 's' and nextLtr == 'h':
                    result.append('k')
                elif nextLtr == 'h':
                    if idx == 0 and nextLtr2 in "aeiou":
                        result.append('k')
                    else:
                        result.append('x')
                elif prevLtr == 'c':
                    result.append('c')
                else:
                    result.append('k')
        elif curLtr == 'g':
            if (idx < lastChar-1) and nextLtr == 'h':
                pass
            elif s[idx:] == 'gned':
                pass
            elif s[idx:] == 'gn':
                pass
            elif prevLtr == 'd' and frontvAfter:
                pass
            else:
                hard = (prevLtr == 'g')
                if frontvAfter and not hard:
                    result.append('j')
                else:
                    result.append('k')
        elif curLtr == 'h':
            if prevLtr in 'csptg':
                pass
            elif vowelBefore and not vowelAfter:
                pass
            else:
                result.append('h')
        elif curLtr == 'k':
            if prevLtr != 'c': result.append('k')
        elif curLtr in 'wy':
            if vowelAfter:
                result.append(curLtr)

    return "".join(result)

def demo(a):
    print a, "=>", metaphone(a)

if __name__ == '__main__':
    demo("Nick. Mathewson")

    demo("joe schmidt")
    demo("Beethoven")

    demo("Because the world is round")
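reconcile.py uses metaphone() to match author names that are spelled differently but sound alike. A quick illustration; per my reading of the tables above, double letters are skipped and 'th' maps to '0', so both spellings below should print the same code ("m0n"), though that is worth verifying:

    from metaphone import metaphone

    print metaphone("Mathewson")
    print metaphone("Matthewson")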
i2p2www/anonbib/rank.py (new file, 202 lines)
@@ -0,0 +1,202 @@
# Make rankings of papers and authors for automatic classification of content hotness

# Google Scholar address
# http://scholar.google.com/scholar?as_epq=

# Take care of the caching setup
cache_expire = 60*60*24*30 # 30 days

# Checks
import config
import os
import sys
from os.path import exists, isdir, join, getmtime
from os import listdir, remove

def remove_old():
    # Remove all old cached files
    filenames = listdir(cache_folder())
    from time import time
    now = time()
    for f in filenames:
        pf = join(cache_folder(), f)
        time_mt = getmtime(pf)
        if now - time_mt > cache_expire: # 30 days
            remove(pf)

def cache_folder():
    r = join(config.OUTPUT_DIR, config.CITE_CACHE_DIR)
    if not exists(r):
        os.makedirs(r)
    assert isdir(r)
    return r

import re
from urllib2 import urlopen, build_opener
from urllib import quote
from datetime import date
import hashlib

# A more handy hash
def md5h(s):
    m = hashlib.md5()
    m.update(s)
    return m.hexdigest()

format_tested = 0

def getPageForTitle(title, cache=True, update=True, save=True):
    #Returns (citation-count, scholar url) tuple, or (None,None)
    global format_tested
    if not format_tested and update:
        format_tested = 1
        TestScholarFormat()

    # Do not assume that the title is clean
    title = re.sub("\s+", " ", title)
    title = re.sub("[^'a-zA-Z0-9\. \-\/:]", "", title)
    title = re.sub("'\/", " ", title)

    # We rely on google scholar to return the article with this exact title
    gurl = "http://scholar.google.com/scholar?as_q=&as_epq=%s&as_occt=title"

    url = gurl % quote(title)

    # Access cache or network
    if exists(join(cache_folder(), md5h(url))) and cache:
        return url, file(join(cache_folder(), md5h(url)),'r').read()
    elif update:
        print "Downloading rank for %r."%title

        # Make a custom user agent (so that we are not filtered by Google)!
        opener = build_opener()
        opener.addheaders = [('User-agent', 'Anon.Bib.0.1')]

        print "connecting..."
        connection = opener.open(url)
        print "reading"
        page = connection.read()
        print "done"
        if save:
            file(join(cache_folder(), md5h(url)),'w').write(page)
        return url, page
    else:
        return url, None

def getCite(title, cache=True, update=True, save=True):
    url, page = getPageForTitle(title, cache=cache, update=update, save=save)
    if not page:
        return None,None

    # Check if it finds any articles
    if len(re.findall("did not match any articles", page)) > 0:
        return (None, None)

    # Kill all tags!
    cpage = re.sub("<[^>]*>", "", page)

    # Add up all citations
    s = sum([int(x) for x in re.findall("Cited by ([0-9]*)", cpage)])
    return (s, url)

def getPaperURLs(title, cache=True, update=True, save=True):
    url, page = getPageForTitle(title, cache=cache, update=update, save=save)
    if not page:
        return []
    pages = re.findall(r'\&\#x25ba\;.*class=fl href="([^"]*)"', page)
    return pages

def get_rank_html(title, years=None, base_url=".", update=True,
                  velocity=False):
    s,url = getCite(title, update=update)

    # Paper cannot be found
    if s is None:
        return ''

    html = ''

    url = url.replace("&","&amp;")

    # Hotness
    H,h = 50,5
    if s >= H:
        html += '<a href="%s"><img src="%s/gold.gif" alt="More than %s citations on Google Scholar" title="More than %s citations on Google Scholar" /></a>' % (url,base_url,H,H)
    elif s >= h:
        html += '<a href="%s"><img src="%s/silver.gif" alt="More than %s citations on Google Scholar" title="More than %s citations on Google Scholar" /></a>' % (url,base_url,h,h)

    # Only include the velocity if asked.
    if velocity:
        # Velocity
        d = date.today().year - int(years)
        if d >= 0:
            if 2 < s / (d +1) < 10:
                html += '<img src="%s/ups.gif" />' % base_url
            if 10 <= s / (d +1):
                html += '<img src="%s/upb.gif" />' % base_url

    return html

def TestScholarFormat():
    # We need to ensure that Google Scholar does not change its page format under our feet
    # Use some cases to check if all is good
    print "Checking google scholar formats..."
    stopAndGoCites = getCite("Stop-and-Go MIXes: Providing Probabilistic Anonymity in an Open System", False)[0]
    dragonCites = getCite("Mixes protected by Dragons and Pixies: an empirical study", False, save=False)[0]

    if stopAndGoCites in (0, None):
        print """OOPS.\n
It looks like Google Scholar changed their URL format or their output format.
I went to count the cites for the Stop-and-Go MIXes paper, and got nothing."""
        sys.exit(1)

    if dragonCites != None:
        print """OOPS.\n
It looks like Google Scholar changed their URL format or their output format.
I went to count the cites for a fictitious paper, and found some."""
        sys.exit(1)

def urlIsUseless(u):
    if u.find("freehaven.net/anonbib/") >= 0:
        # Our own cache is not the primary citation for anything.
        return True
    elif u.find("owens.mit.edu") >= 0:
        # These citations only work for 'members of the MIT community'.
        return True
    else:
        return False

URLTYPES=[ "pdf", "ps", "txt", "ps_gz", "html" ]

if __name__ == '__main__':
    # First download the bibliography file.
    import BibTeX
    suggest = False
    if sys.argv[1] == 'suggest':
        suggest = True
        del sys.argv[1]

    config.load(sys.argv[1])
    if config.CACHE_UMASK != None:
        os.umask(config.CACHE_UMASK)
    bib = BibTeX.parseFile(config.MASTER_BIB)
    remove_old()

    print "Downloading missing ranks."
    for ent in bib.entries:
        getCite(ent['title'], cache=True, update=True)

    if suggest:
        for ent in bib.entries:
            haveOne = False
            for utype in URLTYPES:
                if ent.has_key("www_%s_url"%utype):
                    haveOne = True
                    break
            if haveOne:
                continue
            print ent.key, "has no URLs given."
            urls = [ u for u in getPaperURLs(ent['title']) if not urlIsUseless(u) ]
            for u in urls:
                print "\t", u
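In short: getCite() returns a (citation-count, scholar-url) pair, and get_rank_html() turns the count into a gold badge at 50 or more cites or a silver badge at 5 or more. A sketch of offline use, with update=False so only the local cite_cache is consulted and no Google traffic is sent; the title is just an example entry:

    import config, rank

    config.load("anonbib.cfg")
    count, url = rank.getCite("Tor: The Second-Generation Onion Router",
                              update=False)   # (None, None) if not cached
    if count is not None:
        print "%d citations (%s)" % (count, url)
    print rank.get_rank_html("Tor: The Second-Generation Onion Router",
                             update=False)    # '' when no count is available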
i2p2www/anonbib/reconcile.py (new file, 292 lines)
@@ -0,0 +1,292 @@
#!/usr/bin/python2
# Copyright 2003-2008, Nick Mathewson.  See LICENSE for licensing info.

"""Code to determine which entries are new and which are old.

To scan a new file, run "python reconcile.py anonbib.cfg new-file.bib".  This
will generate a new bibtex file called 'tmp.bib', with all the new entries
cleaned up a little, and all the duplicate entries commented out.
"""

import sys
import re

assert sys.version_info[:3] >= (2,2,0)

import BibTeX
import config
import metaphone

_MPCACHE = {}
def soundsLike(s1, s2):
    c = _MPCACHE
    s1 = clean(s1)
    s2 = clean(s2)
    try:
        m1 = c[s1]
    except KeyError:
        m1 = c[s1] = metaphone.metaphone(s1)
    try:
        m2 = c[s2]
    except KeyError:
        m2 = c[s2] = metaphone.metaphone(s2)

    return m1 == m2

def mphone(s):
    c = _MPCACHE
    s = clean(s)
    try:
        return c[s]
    except:
        m = c[s] = metaphone.metaphone(s)
        return m

def clean(s):
    s = re.sub(r'\s+', ' ', s)
    s = s.strip()
    return s

class MasterBibTeX(BibTeX.BibTeX):
    def __init__(self):
        BibTeX.BibTeX.__init__(self)

    def buildIndex(self):
        self.byTitle = {}
        for ent in self.entries:
            for t in self._titleForms(ent['title']):
                self.byTitle.setdefault(t, []).append(ent)

    def _titleForms(self, title):
        title = title.lower()
        title = re.sub(r'\b(an|a|the|of)\b', "", title)
        title = clean(title)
        res = [ mphone(title) ]
        if ':' in title:
            for t in title.split(":"):
                res.append(mphone(t.strip()))
        #print "%r\n => %s" % (title,res)
        return res

    def _titlesAlike(self, t1, t2):
        t1 = clean(t1)
        t2 = clean(t2)
        if t1 == t2:
            return 2
        tf1 = self._titleForms(t1)
        tf2 = self._titleForms(t2)
        for t in tf1:
            if t in tf2: return 1
        return 0

    def _authorsAlike(self, a1, a2):
        if not soundsLike(" ".join(a1.last)," ".join(a2.last)):
            return 0

        if (a1.first == a2.first and a1.von == a2.von
            and a1.jr == a2.jr):
            return 2


        if soundsLike(" ".join(a1.first), " ".join(a2.first)):
            return 1

        if not a1.first or not a2.first:
            return 1

        if self._initialize(a1.first) == self._initialize(a2.first):
            return 1

        return 0

    def _initialize(self, name):
        name = " ".join(name).lower()
        name = re.sub(r'([a-z])[a-z\.]*', r'\1', name)
        name = clean(name)
        return name

    def _authorListsAlike(self, a1, a2):
        if len(a1) != len(a2):
            return 0
        a1 = [ (a.last, a) for a in a1 ]
        a2 = [ (a.last, a) for a in a2 ]
        a1.sort()
        a2.sort()
        if len(a1) != len(a2):
            return 0
        r = 2
        for (_, a1), (_, a2) in zip(a1,a2):
            x = self._authorsAlike(a1,a2)
            if not x:
                return 0
            elif x == 1:
                r = 1
        return r

    def _entryDatesAlike(self, e1, e2):
        try:
            if clean(e1['year']) == clean(e2['year']):
                return 2
            else:
                return 0
        except KeyError:
            return 1

    def includes(self, ent, all=0):
        title = ent['title']
        candidates = []
        for form in self._titleForms(title):
            try:
                candidates.extend(self.byTitle[form])
            except KeyError:
                pass
        goodness = []
        for knownEnt in candidates:
            match = (self._entryDatesAlike(ent, knownEnt) *
                     self._titlesAlike(ent['title'], knownEnt['title']) *
                     self._authorListsAlike(ent.parsedAuthor,
                                            knownEnt.parsedAuthor) )
            if match:
                goodness.append((match, knownEnt))
        goodness.sort()
        if all:
            return goodness
        if goodness:
            return goodness[-1]
        else:
            return None, None

    def demo(self):
        for e in self.entries:
            matches = self.includes(e, 1)
            m2 = []
            mids = []
            for g,m in matches:
                if id(m) not in mids:
                    mids.append(id(m))
                    m2.append((g,m))
            matches = m2

            if not matches:
                print "No match for %s"%e.key
            elif matches[-1][1] is e:
                print "%s matches for %s: OK."%(len(matches), e.key)
            else:
                print "%s matches for %s: %s is best!" %(len(matches), e.key,
                                                         matches[-1][1].key)
            if len(matches) > 1:
                for g, m in matches:
                    print "%%%% goodness", g
                    print m


def noteToURL(note):
    " returns tp, url "
    note = note.replace("\n", " ")
    m = re.match(r'\s*(?:\\newline\s*)*\s*\\url{(.*)}\s*(?:\\newline\s*)*',
                 note)
    if not m:
        return None
    url = m.group(1)
    for suffix, tp in ((".html", "html"),
                       (".ps", "ps"),
                       (".ps.gz", "ps_gz"),
                       (".pdf", "pdf"),
                       (".txt", "txt")):
        if url.endswith(suffix):
            return tp,url
    return "???", url

all_ok = 1
def emit(f,ent):
    global all_ok

    errs = ent._check()
    if master.byKey.has_key(ent.key.strip().lower()):
        errs.append("ERROR: Key collision with master file")

    if errs:
        all_ok = 0

    note = ent.get("note")
    if ent.getURL() and not note:
        ent['note'] = "\url{%s}"%ent.getURL()
    elif note:
        m = re.match(r'\\url{(.*)}', note)
        if m:
            url = m.group(0)
            tp = None
            if url.endswith(".txt"):
                tp = "txt"
            elif url.endswith(".ps.gz"):
                tp = "ps_gz"
            elif url.endswith(".ps"):
                tp = "ps_gz"
            elif url.endswith(".pdf"):
                tp = "pdf"
            elif url.endswith(".html"):
                tp = "html"
            if tp:
                ent['www_%s_url'%tp] = url

    if errs:
        all_ok = 0
        for e in errs:
            print >>f, "%%%%", e

    print >>f, ent.format(77, 4, v=1, invStrings=invStrings)

def emitKnown(f, ent, matches):
    print >>f, "%% Candidates are:", ", ".join([e.key for g,e in matches])
    print >>f, "%%"
    print >>f, "%"+(ent.format(77,4,1,invStrings).replace("\n", "\n%"))

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print "reconcile.py expects 2 arguments"
        sys.exit(1)

    config.load(sys.argv[1])

    print "========= Scanning master =========="
    master = MasterBibTeX()
    master = BibTeX.parseFile(config.MASTER_BIB, result=master)
    master.buildIndex()

    print "========= Scanning new file ========"
    try:
        fn = sys.argv[2]
        input = BibTeX.parseFile(fn)
    except BibTeX.ParseError, e:
        print "Error parsing %s: %s"%(fn,e)
        sys.exit(1)

    f = open('tmp.bib', 'w')
    keys = input.newStrings.keys()
    keys.sort()
    for k in keys:
        v = input.newStrings[k]
        print >>f, "@string{%s = {%s}}"%(k,v)

    invStrings = input.invStrings

    for e in input.entries:
        if not (e.get('title') and e.get('author')):
            print >>f, "%%\n%%%% Not enough information to search for a match: need title and author.\n%%"
            emit(f, e)
            continue

        matches = master.includes(e, all=1)
        if not matches:
            print >>f, "%%\n%%%% This entry is probably new: No match found.\n%%"
            emit(f, e)
        else:
            print >>f, "%%"
            print >>f, "%%%% Possible match found for this entry; max goodness",\
                  matches[-1][0], "\n%%"
            emitKnown(f, e, matches)

    if not all_ok:
        print >>f, "\n\n\nErrors remain; not finished.\n"

    f.close()
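The heart of the script is MasterBibTeX.includes(), which scores a candidate entry as the product of its date, title, and author-list similarity against each indexed master entry. A minimal sketch of calling it directly ("new-file.bib" is the hypothetical input named in the docstring):

    import config, BibTeX
    from reconcile import MasterBibTeX

    config.load("anonbib.cfg")
    master = BibTeX.parseFile(config.MASTER_BIB, result=MasterBibTeX())
    master.buildIndex()

    for e in BibTeX.parseFile("new-file.bib").entries:
        goodness, match = master.includes(e)   # (None, None) if no match
        if match is not None:
            print "%s looks like %s (goodness %s)" % (e.key, match.key,
                                                      goodness)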
i2p2www/anonbib/silver.gif (new binary file, 539 B): binary file not shown.
i2p2www/anonbib/testbib/pdos.bib (new file, 1742 lines): diff suppressed because it is too large.
i2p2www/anonbib/tests.py (new file, 86 lines)
@@ -0,0 +1,86 @@
#!/usr/bin/python2
# Copyright 2004-2008, Nick Mathewson.  See LICENSE for licensing info.

"""Unit tests for anonbib."""

import BibTeX
import metaphone
#import reconcile
#import writeHTML
#import updateCache

import unittest

class MetaphoneTests(unittest.TestCase):
    def testMetaphone(self):
        pass

class BibTeXTests(unittest.TestCase):
    def testTranslation(self):
        ut = BibTeX.url_untranslate
        self.assertEquals(ut("Fred"),"Fred")
        self.assertEquals(ut("Hello, World."), "Hello_2c_20World.")

        te = BibTeX.TeXescapeURL
        ute = BibTeX.unTeXescapeURL
        self.assertEquals(te("http://example/~me/my_file"),
                          r"http://example/\{}~me/my\_file")
        self.assertEquals(ute(r"http:{}//example/\{}~me/my\_file"),
                          "http://example/~me/my_file")

        h = BibTeX.htmlize
        self.assertEquals(h("Hello, world"), "Hello, world")
        self.assertEquals(h(r"\'a\`e\'{i}(\'\i)\"o&\^u"),
                          "&aacute;&egrave;&iacute;(&iacute;)&ouml;&amp;"
                          "&ucirc;")
        self.assertEquals(h(r"\~n and \c{c}"), "&ntilde; and &ccedil;")
        self.assertEquals(h(r"\AE---a ligature"), "&AElig;&mdash;a ligature")
        self.assertEquals(h(r"{\it 33}"), " 33")
        self.assertEquals(h(r"Pages 33--99 or vice--versa?"),
                          "Pages 33&ndash;99 or vice&ndash;versa?")

        t = BibTeX.txtize
        self.assertEquals(t("Hello, world"), "Hello, world")
        self.assertEquals(t(r"\'a\`e\'{i}(\'\i)\"o&\^u"),
                          "aei(i)o&u")
        self.assertEquals(t(r"\~n and \c{c}"), "n and c")
        self.assertEquals(t(r"\AE---a ligature"), "AE---a ligature")
        self.assertEquals(t(r"{\it 33}"), " 33")
        self.assertEquals(t(r"Pages 33--99 or vice--versa?"),
                          "Pages 33--99 or vice--versa?")

    def authorsParseTo(self,authors,result):
        pa = BibTeX.parseAuthor(authors)
        self.assertEquals(["|".join(["+".join(item) for item in
                                     [a.first,a.von,a.last,a.jr]])
                           for a in pa],
                          result)

    def testAuthorParsing(self):
        pa = BibTeX.parseAuthor
        PA = BibTeX.ParsedAuthor
        apt = self.authorsParseTo

        apt("Nick A. Mathewson and Roger Dingledine",
            ["Nick+A.||Mathewson|", "Roger||Dingledine|"])
        apt("John van Neumann", ["John|van|Neumann|"])
        apt("P. Q. Z. de la Paz", ["P.+Q.+Z.|de+la|Paz|"])
        apt("Cher", ["||Cher|"])
        apt("Smith, Bob", ["Bob||Smith|"])
        apt("de Smith, Bob", ["Bob|de|Smith|"])
        apt("de Smith, Bob Z", ["Bob+Z|de|Smith|"])
        #XXXX Fix this.
        #apt("Roberts Smith Wilkins, Bob Z", ["Bob+Z||Smith+Wilkins|"])
        apt("Smith, Jr, Bob", ["Bob||Smith|Jr"])

        #XXXX Fix this.
        #apt("R Jones, Jr.", ["R||Jones|Jr."])
        apt("Smith, Bob and John Smith and Last,First",
            ["Bob||Smith|", "John||Smith|", "First||Last|"])
        apt("Bob Smith and John Smith and John Doe",
            ["Bob||Smith|", "John||Smith|", "John||Doe|"])


if __name__ == '__main__':
    unittest.main()
i2p2www/anonbib/upb.gif (new binary file, 555 B): binary file not shown.
169
i2p2www/anonbib/updateCache.py
Executable file
169
i2p2www/anonbib/updateCache.py
Executable file
@ -0,0 +1,169 @@
|
|||||||
|
#!/usr/bin/python
|
||||||
|
# Copyright 2003-2008, Nick Mathewson. See LICENSE for licensing info.
|
||||||
|
|
||||||
|
"""Download files in bibliography into a local cache.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import signal
|
||||||
|
import time
|
||||||
|
import gzip
|
||||||
|
|
||||||
|
import BibTeX
|
||||||
|
import config
|
||||||
|
import urllib2
|
||||||
|
import getopt
|
||||||
|
import socket
|
||||||
|
import errno
|
||||||
|
import httplib
|
||||||
|
|
||||||
|
FILE_TYPES = [ "txt", "html", "pdf", "ps", "ps.gz", "abstract" ]
|
||||||
|
BIN_FILE_TYPES = [ 'pdf', 'ps.gz' ]
|
||||||
|
|
||||||
|
class UIError(Exception):
|
||||||
|
pass
|
||||||
|
|
||||||
|
def tryUnlink(fn):
|
||||||
|
try:
|
||||||
|
os.unlink(fn)
|
||||||
|
except OSError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
def getCacheFname(key, ftype, section):
|
||||||
|
return BibTeX.smartJoin(config.OUTPUT_DIR,config.CACHE_DIR,
|
||||||
|
section,
|
||||||
|
"%s.%s"%(key,ftype))
|
||||||
|
|
||||||
|
def downloadFile(key, ftype, section, url,timeout=None):
|
||||||
|
if timeout is None:
|
||||||
|
timeout = config.DOWNLOAD_CONNECT_TIMEOUT
|
||||||
|
fname = getCacheFname(key, ftype, section)
|
||||||
|
parent = os.path.split(fname)[0]
|
||||||
|
if not os.path.exists(parent):
|
||||||
|
os.makedirs(parent)
|
||||||
|
|
||||||
|
fnameTmp = fname+".tmp"
|
||||||
|
fnameURL = fname+".url"
|
||||||
|
tryUnlink(fnameTmp)
|
||||||
|
|
||||||
|
def sigalrmHandler(sig,_):
|
||||||
|
pass
|
||||||
|
signal.signal(signal.SIGALRM, sigalrmHandler)
|
||||||
|
signal.alarm(timeout)
|
||||||
|
try:
|
||||||
|
try:
|
||||||
|
infile = urllib2.urlopen(url)
|
||||||
|
except httplib.InvalidURL, e:
|
||||||
|
raise UIError("Invalid URL %s: %s"%(url,e))
|
||||||
|
except IOError, e:
|
||||||
|
raise UIError("Cannot connect to url %s: %s"%(url,e))
|
||||||
|
except socket.error, e:
|
||||||
|
if getattr(e,"errno",-1) == errno.EINTR:
|
||||||
|
raise UIError("Connection timed out to url %s"%url)
|
||||||
|
else:
|
||||||
|
raise UIError("Error connecting to %s: %s"%(url, e))
|
||||||
|
finally:
|
||||||
|
signal.alarm(0)
|
||||||
|
|
||||||
|
mode = 'w'
|
||||||
|
if ftype in BIN_FILE_TYPES:
|
||||||
|
mode = 'wb'
|
||||||
|
outfile = open(fnameTmp, mode)
|
||||||
|
try:
|
||||||
|
while 1:
|
||||||
|
s = infile.read(1<<16)
|
||||||
|
if not s: break
|
||||||
|
outfile.write(s)
|
||||||
|
finally:
|
||||||
|
infile.close()
|
||||||
|
outfile.close()
|
||||||
|
|
||||||
|
urlfile = open(fnameURL, 'w')
|
||||||
|
print >>urlfile, time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
|
||||||
|
if "\n" in url: url = url.replace("\n", " ")
|
||||||
|
print >>urlfile, url
|
||||||
|
urlfile.close()
|
||||||
|
|
||||||
|
os.rename(fnameTmp, fname)
|
||||||
|
|
||||||
|
def getURLs(entry):
    r = {}
    for ftype in FILE_TYPES:
        ftype2 = ftype.replace(".", "_")
        url = entry.get("www_%s_url"%ftype2)
        if url:
            r[ftype] = url.strip().replace("\n", " ")
    return r
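For illustration, the key scheme getURLs() expects (hypothetical URLs; a real entry is a BibTeXEntry, but any mapping with .get() behaves the same here):

    entry = {"www_pdf_url":   "https://example.org/paper.pdf",
             "www_ps_gz_url": "https://example.org/paper.ps.gz"}
    # Dots in a file type become underscores in the key, so "ps.gz" is
    # looked up as "www_ps_gz_url", and getURLs(entry) would return
    # {"pdf": "https://example.org/paper.pdf",
    #  "ps.gz": "https://example.org/paper.ps.gz"}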
def getCachedURL(key, ftype, section):
    fname = getCacheFname(key, ftype, section)
    urlFname = fname+".url"
    if not os.path.exists(fname) or not os.path.exists(urlFname):
        return None
    f = open(urlFname, 'r')
    lines = f.readlines()
    f.close()
    if len(lines) != 2:
        print >>sys.stderr, "ERROR: unexpected number of lines in", urlFname
        # Treat a malformed record as if nothing were cached, rather than
        # crashing below when the file has fewer than two lines.
        return None
    return lines[1].strip()
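For reference, a well-formed ".url" sidecar, the two lines downloadFile() writes and the length check above expects, would look like this (hypothetical timestamp and URL):

    2010-06-16 12:00:00
    https://example.org/paper.pdf

getCachedURL() hands back the second line, which downloadAll() compares against the current www_*_url value to detect documents that have moved.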
def downloadAll(bibtex, missingOnly=0):
    """returns list of tuples of key, ftype, url, error"""
    errors = []
    for e in bibtex.entries:
        urls = getURLs(e)
        key = e.key
        section = e.get("www_cache_section", ".")
        for ftype, url in urls.items():
            if missingOnly:
                cachedURL = getCachedURL(key, ftype, section)
                if cachedURL == url:
                    print >>sys.stderr, "Skipping", url
                    continue
                elif cachedURL is not None:
                    print >>sys.stderr, "URL for %s.%s has changed"%(key,ftype)
                else:
                    print >>sys.stderr, "I have no copy of %s.%s"%(key,ftype)
            try:
                downloadFile(key, ftype, section, url)
                print "Downloaded", url
            except UIError, e:
                print >>sys.stderr, str(e)
                errors.append((key,ftype,url,str(e)))
            except (IOError, socket.error), e:
                msg = "Error downloading %s: %s"%(url,str(e))
                print >>sys.stderr, msg
                errors.append((key,ftype,url,msg))
        if urls.has_key("ps") and not urls.has_key("ps.gz"):
            # Say, this is something we'd like to have gzipped locally.
            psFname = getCacheFname(key, "ps", section)
            psGzFname = getCacheFname(key, "ps.gz", section)
            if os.path.exists(psFname) and not os.path.exists(psGzFname):
                # This is something we haven't gzipped yet.
                print "Compressing a copy of", psFname
                outf = gzip.GzipFile(psGzFname, "wb")
                inf = open(psFname, "rb")
                while 1:
                    s = inf.read(4096)
                    if not s:
                        break
                    outf.write(s)
                outf.close()
                inf.close()

    return errors
if __name__ == '__main__':
    if len(sys.argv) == 2:
        print "Loading from %s"%sys.argv[1]
    else:
        print >>sys.stderr, "Expected a single configuration file as an argument"
        sys.exit(1)
    config.load(sys.argv[1])

    if config.CACHE_UMASK is not None:
        os.umask(config.CACHE_UMASK)

    bib = BibTeX.parseFile(config.MASTER_BIB)
    downloadAll(bib, missingOnly=1)
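A minimal sketch of driving the cache refresh from Python rather than the command line, assuming a config file such as anonbib.cfg and that updateCache.py is importable from the working directory; downloadAll() returns (key, ftype, url, error) tuples for anything it could not fetch:

    import BibTeX, config, updateCache

    config.load("anonbib.cfg")  # hypothetical config path
    bib = BibTeX.parseFile(config.MASTER_BIB)
    for key, ftype, url, err in updateCache.downloadAll(bib, missingOnly=1):
        print "could not fetch %s.%s from %s: %s" % (key, ftype, url, err)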
BIN
i2p2www/anonbib/ups.gif
Normal file
Binary file not shown. Size: 536 B.
41
i2p2www/anonbib/venue-checklist.txt
Normal file
@ -0,0 +1,41 @@
This file is to keep track of which volumes of which publications have
been combed for anonymity papers and which we still have to add.

=== DONE:

ExampleConference (through 2008)

PETS 2000-2003

=== CLAIMED:

PETS 2000-2010 -- Nick (claimed 6/16)
ESORICS 1990-2010 -- Nick (claimed 6/16)
CCS -- George (claimed 6/17)
USENIX Security -- George (claimed 6/17)

=== SHOULD DO:

Infohiding
IEEE Security and Privacy ("Oakland")
NDSS
WPES
WEIS
Financial Crypto
Eurocrypt
Asiacrypt

Search: Papers that cite Chaum's paper
Search: Papers that cite the Tor paper
Search: Papers that cite the original onion routing papers
Search: Papers mentioning "anonymity" or "anonymous"
Search: Papers mentioning "mixnet" or "mix-net"

=== UNDERSERVED CONTENT; PLEASE SUGGEST SEARCHES AND VENUES

Private information retrieval; PIR
Anti-censorship; censorship
Location privacy
Anonymous credentials
Anonymizing data
Secure multiparty computation
246
i2p2www/anonbib/writeHTML.py
Executable file
@ -0,0 +1,246 @@
#!/usr/bin/python
# Copyright 2003-2008, Nick Mathewson. See LICENSE for licensing info.

"""Generate indices by author, topic, date, and BibTeX key."""

import sys
import re
import os
import json

assert sys.version_info[:3] >= (2,2,0)
os.umask(022)

import BibTeX
import config
def getTemplate(name):
    f = open(name)
    template = f.read()
    f.close()
    template_s, template_e = template.split("%(entries)s")
    return template_s, template_e
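To illustrate the contract getTemplate() assumes (a hypothetical skeleton, not the project's actual template file): the file must contain the literal marker %(entries)s exactly once, and the halves before and after it are later %-formatted with the `fields` dict built in writeHTML() below, so they may use keys such as %(title)s and %(root)s:

    <html><head><title>%(title)s</title></head><body>
    <h1>%(title)s</h1>
    <ul>
    %(entries)s
    </ul>
    </body></html>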
def pathLength(s):
    n = 0
    while s:
        parent, leaf = os.path.split(s)
        if leaf != '' and leaf != '.':
            n += 1
        s = parent
    return n
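pathLength() counts the directory components of a relative path; writePageSet() below uses it to build the right number of "../" hops back to the output root. A few expected values, as a sketch:

    # pathLength("papers/anonymity") == 2
    # pathLength("papers/")          == 1   (the empty trailing leaf is skipped)
    # pathLength(".")                == 0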
def writeBody(f, sections, section_urls, cache_path, base_url):
    '''f: an open file
       sections: list of (sectionname, [list of BibTeXEntry])
       section_urls: map from sectionname to external url'''
    for s, entries in sections:
        u = section_urls.get(s)
        sDisp = re.sub(r'\s+', ' ', s.strip())
        # Use non-breaking spaces so section names don't wrap.
        sDisp = sDisp.replace(" ", "&nbsp;")
        if u:
            print >>f, ('<li><h3><a name="%s"></a><a href="%s">%s</a></h3>'%(
                (BibTeX.url_untranslate(s), u, sDisp)))
        else:
            print >>f, ('<li><h3><a name="%s">%s</a></h3>'%(
                BibTeX.url_untranslate(s), sDisp))
        print >>f, "<ul class='expand'>"
        for e in entries:
            print >>f, e.to_html(cache_path=cache_path, base_url=base_url)
        print >>f, "</ul></li>"
def writeHTML(f, sections, sectionType, fieldName, choices,
              tag, config, cache_url_path, section_urls={}):
    """sections: list of (sectionname, [list of BibTeXEntry])
       sectionType: str
       fieldName: str
       choices: list of (choice, url)"""

    title = config.TAG_TITLES[tag]
    short_title = config.TAG_SHORT_TITLES[tag]

    # Build the in-page list of section links.
    secStr = []
    for s, _ in sections:
        hts = re.sub(r'\s+', ' ', s.strip())
        hts = hts.replace(" ", "&nbsp;")
        secStr.append("<p class='l2'><a href='#%s'>%s</a></p>\n"%
                      ((BibTeX.url_untranslate(s),hts)))
    secStr = "".join(secStr)

    # Build the links to the other tagged bibliographies.
    tagListStr = []
    st = config.TAG_SHORT_TITLES.keys()
    st.sort()
    root = "../"*pathLength(config.TAG_DIRECTORIES[tag])
    if root == "": root = "."
    for t in st:
        name = config.TAG_SHORT_TITLES[t]
        if t == tag:
            tagListStr.append(name)
        else:
            url = BibTeX.smartJoin(root, config.TAG_DIRECTORIES[t], "date.html")
            tagListStr.append("<a href='%s'>%s</a>"%(url, name))
    tagListStr = " | ".join(tagListStr)

    # Build the "view by topic/date/author" links.
    choiceStr = []
    for choice, url in choices:
        if url:
            choiceStr.append("<a href='%s'>%s</a>"%(url, choice))
        else:
            choiceStr.append(choice)

    choiceStr = " | ".join(choiceStr)

    fields = { 'command_line' : "",
               'sectiontypes' : sectionType,
               'choices' : choiceStr,
               'field': fieldName,
               'sections' : secStr,
               'otherbibs' : tagListStr,
               'title': title,
               'short_title': short_title,
               "root" : root,
               }

    header, footer = getTemplate(config.TEMPLATE_FILE)
    print >>f, header%fields
    writeBody(f, sections, section_urls, cache_path=cache_url_path,
              base_url=root)
    print >>f, footer%fields
def jsonDumper(obj):
    if isinstance(obj, BibTeX.BibTeXEntry):
        e = obj.entries.copy()
        e['key'] = obj.key
        return e
    else:
        raise TypeError("Do not know how to serialize %s"%(obj.__class__,))
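As a sketch of the resulting bibtex.json shape (hypothetical field values): each BibTeXEntry becomes its field dictionary plus the citation key, so a serialized entry would look roughly like:

    # {"key": "example04", "title": "An Example Paper",
    #  "author": "Bob Smith", "year": "2004"}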
def writePageSet(config, bib, tag):
    if tag:
        bib_entries = [ b for b in bib.entries
                        if tag in b.get('www_tags', "").split() ]
    else:
        bib_entries = bib.entries[:]

    if not bib_entries:
        print >>sys.stderr, "No entries with tag %r; skipping"%tag
        return

    tagdir = config.TAG_DIRECTORIES[tag]
    outdir = os.path.join(config.OUTPUT_DIR, tagdir)
    cache_url_path = BibTeX.smartJoin("../"*pathLength(tagdir),
                                      config.CACHE_DIR)
    if not os.path.exists(outdir):
        os.makedirs(outdir, 0755)
    ##### Sorted views:

    ## By topic.

    # The "ZZZZ..." sentinel sorts entries with no www_section to the end.
    entries = BibTeX.sortEntriesBy(bib_entries, "www_section", "ZZZZZZZZZZZZZZ")
    entries = BibTeX.splitSortedEntriesBy(entries, "www_section")
    if entries[-1][0].startswith("<span class='bad'>"):
        entries[-1] = ("Miscellaneous", entries[-1][1])

    entries = [ (s, BibTeX.sortEntriesByDate(ents))
                for s, ents in entries
                ]

    f = open(os.path.join(outdir,"topic.html"), 'w')
    writeHTML(f, entries, "Topics", "topic",
              (("By topic", None),
               ("By date", "./date.html"),
               ("By author", "./author.html")
               ),
              tag=tag, config=config,
              cache_url_path=cache_url_path)
    f.close()
    ## By date.

    entries = BibTeX.sortEntriesByDate(bib_entries)
    entries = BibTeX.splitSortedEntriesBy(entries, 'year')
    for idx in -1, -2:
        if entries[idx][0].startswith("<span class='bad'>"):
            entries[idx] = ("Unknown", entries[idx][1])
        elif entries[idx][0].startswith("forthcoming"):
            entries[idx] = ("Forthcoming", entries[idx][1])
    sections = [ ent[0] for ent in entries ]

    first_year = int(entries[0][1][0]['year'])
    try:
        last_year = int(entries[-1][1][0].get('year'))
    except ValueError:
        last_year = int(entries[-2][1][0].get('year'))

    years = map(str, range(first_year, last_year+1))
    if entries[-1][0] == 'Unknown':
        years.append("Unknown")

    f = open(os.path.join(outdir,"date.html"), 'w')
    writeHTML(f, entries, "Years", "date",
              (("By topic", "./topic.html"),
               ("By date", None),
               ("By author", "./author.html")
               ),
              tag=tag, config=config,
              cache_url_path=cache_url_path)
    f.close()
    ## By author
    entries, url_map = BibTeX.splitEntriesByAuthor(bib_entries)

    f = open(os.path.join(outdir,"author.html"), 'w')
    writeHTML(f, entries, "Authors", "author",
              (("By topic", "./topic.html"),
               ("By date", "./date.html"),
               ("By author", None),
               ),
              tag=tag, config=config,
              cache_url_path=cache_url_path,
              section_urls=url_map)
    f.close()
    ## The big BibTeX file

    entries = bib_entries[:]
    entries = [ (ent.key, ent) for ent in entries ]
    entries.sort()
    entries = [ ent[1] for ent in entries ]

    ## Finding the root directory is done by writeHTML(), but
    ## the BibTeX file doesn't use that, so repeat the code here.
    root = "../"*pathLength(config.TAG_DIRECTORIES[tag])
    if root == "": root = "."

    header, footer = getTemplate(config.BIBTEX_TEMPLATE_FILE)
    f = open(os.path.join(outdir,"bibtex.html"), 'w')
    print >>f, header % { 'command_line' : "",
                          'title': config.TAG_TITLES[tag],
                          'root': root }
    for ent in entries:
        print >>f, (
            ("<tr><td class='bibtex'><a name='%s'>%s</a>"
             "<pre class='bibtex'>%s</pre></td></tr>")
            %(BibTeX.url_untranslate(ent.key), ent.key, ent.format(90,8,1)))
    print >>f, footer
    f.close()

    f = open(os.path.join(outdir,"bibtex.json"), 'w')
    json.dump(entries, f, default=jsonDumper)
    f.close()
if __name__ == '__main__':
    if len(sys.argv) == 2:
        print "Loading from %s"%sys.argv[1]
    else:
        print >>sys.stderr, "Expected a single configuration file as an argument"
        sys.exit(1)
    config.load(sys.argv[1])

    bib = BibTeX.parseFile(config.MASTER_BIB)

    for tag in config.TAG_DIRECTORIES.keys():
        writePageSet(config, bib, tag)