Wiktionary talk:Todo/Entries containing unpiped w:-links

Latest comment: 11 years ago by Ruakh in topic Code

Code

bzip2 -d < enwiktionary-20120812-pages-articles.xml.bz2 \
| perl -w -ne \
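    ' use warnings;
      use strict;
      # Read the dump one <page> element at a time.
      BEGIN { $/ = "</page>" }
      # Keep only main-namespace (ns 0) pages.
      next unless m/<ns>0<\/ns>/;
      # Skip pages whose text is empty.
      next if m/<text xml:space="preserve" \/>/;
      die "no text found in record $.\n"
        unless m/<text xml:space="preserve">([^<]+)<\/text>/;
      # Keep only pages with at least one unpiped [[w:...]] link.
      next unless $1 =~ m/\[\[w:[^|\]]*\]\]/;
      die "no title found in record $.\n"
        unless m/<title>([^<]+)<\/title>/;
      # Emit the page title as a wiki list item.
      print "* [[$1]]\n"
    '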

Ruakh (talk) 00:40, 14 August 2012 (UTC)