The aim of Wordsmith is to assist with creating tailored, geolocation-based wordlists. As of v1.0, these wordlists are primarily based on U.S. states. Attributes for each state include roads, cities, colleges, area codes, zip codes, sports teams, and landmarks. Wordsmith also includes support for basic word mangling and filtering.
v2.0 will expand Wordsmith to include other countries, territories, and provinces.
On first run, Wordsmith will unpack its bundled data files. This will take less than 5 seconds. Alternatively, you can run wordsmith.rb with the update option to download 175 MB of data from the internet.
A Gemfile has been included to simplify gem installation. The required gems can be installed using bundle install. Alternatively, each gem can be installed manually using gem install <gem>.
Wordsmith uses data that’s been compressed in data.tar.gz. On first run, Wordsmith will unpack this archive to a directory called “data/” in the current working directory. To skip this step, you can unpack the archive manually beforehand using tar -xf data.tar.gz.
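The first-run behaviour described above can be sketched roughly as follows. This is only an illustration and is not taken from wordsmith.rb: the helper name ensure_data_unpacked and its return values are hypothetical, and it assumes that tar is available on the system and that the archive unpacks to “data/” as described.

```ruby
# Hypothetical helper, not Wordsmith's actual code: unpack the data archive
# only when the data directory does not exist yet.
def ensure_data_unpacked(archive = "data.tar.gz", data_dir = "data")
  # Nothing to do if a previous run (or a manual tar -xf) already created data/
  return :already_present if Dir.exist?(data_dir)

  raise "#{archive} not found in #{Dir.pwd}" unless File.file?(archive)

  # Same command that can be run by hand; extracts into the current directory
  ok = system("tar", "-xf", archive)
  raise "tar failed to unpack #{archive}" unless ok

  :unpacked
end
```

Calling ensure_data_unpacked from the directory that contains data.tar.gz would unpack the archive once and do nothing on later calls, which is why unpacking it by hand beforehand has the same effect.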
Two of Wordsmith’s options, -d and -i, use CeWL to scrape words from user-supplied URLs. Wordsmith assumes the CeWL executable (cewl) is on the user’s PATH. If cewl is not found, Wordsmith will skip the URLs and continue. Instructions for installing CeWL can be found in Robin Wood’s CeWL repository: https://github.com/digininja/CeWL
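The “skip the URLs and continue” behaviour can be pictured with a small sketch like the one below. It is an assumption about how such a check might look, not Wordsmith’s actual implementation: find_executable and urls_to_scrape are hypothetical names, and the sketch only decides whether the URLs should be processed, leaving the call to cewl itself out.

```ruby
# Hypothetical helper: return the full path of an executable on PATH, or nil.
def find_executable(name, path_env = ENV.fetch("PATH", ""))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Hypothetical helper: return the URLs to hand to cewl, or an empty list
# (with a warning on stderr) when cewl cannot be found.
def urls_to_scrape(urls, cewl_path = find_executable("cewl"))
  return urls unless cewl_path.nil?

  warn "cewl not found on PATH; skipping #{urls.length} URL(s)"
  []
end
```

With this shape, the rest of the wordlist generation can carry on with an empty list of URLs instead of aborting when cewl is missing.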