Recently I came across two useful web services that make it a bit easier to write research papers: Netspeak and Detexify².
As a non-native English speaker, I often have problems with choosing the right words and I used to ask Google to help me. For example, I would formulate a query "our research * that", look for the most frequent words in the search results, and issue additional queries like "our research indicates that" and "our research shows that" to count hits.
With Netspeak, it is easier, I simply write: our research ? that and I instantly get the most popular phrases with their counts. Netspeak can also find the most popular synonyms of a given word in a given context, or find the most frequent order of given words:
Detexify² solves another small inconvenience: when I didn't remember the LaTeX instruction for a less-common math. symbol, I needed to consult looong lists of symbols and corresponding instructions. Now I can simply draw the symbol and Detexify² will tell me the instruction and the package which I need to use!
Interestingly, the back-end is written in Haskell, and its source code is available on GitHub.
Bolo's notes
Irregular entries on computer science, research, and various curiosities.
Tuesday, November 13, 2012
Wednesday, January 25, 2012
Typoglycemia in Haskell
Can you decipher the following sentences?
All hmuan biegns are bron fere and euqal in dgiinty and rgiths. Tehy are ednwoed wtih raeosn and cnocseicne and sohlud act tworads one aonhter in a sipirt of borhtreohod.
It's the first article of the Universal Declaration of Human Rights, with letters in each word rearranged in a certain way.
Typoglycemia
In the above sentences the first and the last letter of each word stay in their places, while the remaining letters are scrambled. Such text is surprisingly easy to read, especially when reading at a high speed. It seems that our brains need the first and the last letter as anchors, but order of the letters in between is not so important. This feature is dubbed typoglycemia.
Code in Haskell
To play with the above observation, I wrote a short Haskell program which obfuscates an input text as explained above.
import Data.Char (isLetter) mapBlocks :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a] mapBlocks _ _ [] = [] mapBlocks p f xs = f as ++ bs ++ mapBlocks p f zs where (as, ys) = span p xs (bs, zs) = break p ys obfuscateSwap :: [a] -> [a] obfuscateSwap [] = [] obfuscateSwap (c:cs) = c : ob cs where ob (a:b:c:ds) = b : a : ob (c : ds) ob xs = xs main = interact $ mapBlocks isLetter obfuscateSwap
The first function, mapBlocks, takes:
- a function a -> Bool which classifies each element in a list as either "interesting" or not,
- another function [a] -> [a] which does something with a contiguous block of "interesting" elements,
- and a list of elements [a].
The result is a list of elements [a] in which all the blocks of "interesting" elements are transformed using the above-mentioned function.
The second function, obfuscateSwap, takes a list of elements and shuffles all the elements except the first and the last by swapping adjacent elements. For example: obfuscateSwap "towards" == "tworads". Here's a different implementation, which reverses the internal elements (e.g. obfuscateReverse "towards" == "tdrawos"):
obfuscateReverse :: [a] -> [a]
obfuscateReverse xs | length xs < 4 = xs
obfuscateReverse xs | otherwise = [head xs]
++ (reverse . init . tail $ xs) ++ [last xs]
Notes
A few things to notice:
- Both obfuscateSwap . obfuscateSwap and obfuscateReverse . obfuscateReverse are equivalent to id, so we can use the same function to "encrypt" and "decrypt" a message.
- I like the type of mapBlocks: there is a nice separation of work, mapBlocks processes an input list but outsources obfuscation to another function. Also, it's not limited to strings. Only after partial application of isLetter we get a function operating on strings.
- I don't like the implementation of mapBlocks: it's not tail recursive and we will run out of heap for larger inputs. It should definitely be rewritten. A straightforward tail recursive version would feature an additional accumulator.
- Results of obfuscateReverse are harder to decipher than results of obfuscateSwap.
- I'm thinking about (ab)using Parsec for an alternative implementation of mapBlocks.
Appendix
To see the difference between the two versions of obfuscate, read a bit longer text, the second article of UDHR. Here's scrambled using obfuscateSwap:
Eevyrnoe is etntield to all the rgiths and ferdemos set froth in tihs Dcealariton, wtiohut dsiitcniton of any knid, scuh as rcae, clouor, sex, lnauggae, rlegioin, ploticial or ohter oipinon, ntaoianl or scoail oirign, poreptry, brith or ohter satuts. Fruhtreomre, no dsiitcniton sahll be mdae on the bsais of the ploticial, jrusiidtcoianl or itnreanitnoal satuts of the cuotnry or treirotry to wihch a preosn blenogs, wehhter it be idnpeneednt, turst, non-slef-gvoreinng or udner any ohter lmititaoin of svoreiengty.
Here's scrambled using obfuscateReverse:
Enoyreve is eeltitnd to all the rthgis and fmodeers set ftroh in tihs Doitaralcen, wuohtit doitcnitsin of any knid, scuh as rcae, cuolor, sex, lgaugnae, roigilen, pacitilol or oehtr ooinipn, nanoital or saicol oigirn, ptrepory, btrih or oehtr sutats. Fromrehtrue, no doitcnitsin slahl be mdae on the bisas of the pacitilol, janoitcidsirul or ianoitanretnl sutats of the crtnuoy or trotirrey to wcihh a posren bgnoles, wehtehr it be inednepednt, tsurt, non-slef-gninrevog or uednr any oehtr loitatimin of stngierevoy.
Labels:
haskell
Wednesday, January 11, 2012
Assorted curiosities: Geography
Fun facts learned while clicking through Wikipedia:
- Treasure Island in Ontario, Canada is probably the largest island in a lake in an island in a lake.
- Liechtenstein and Uzbekistan are doubly landlocked countries, i.e., all the neighbouring countries are landlocked.
- Republic of Kalmykia is the only predominantly Buddhist region of Europe.
- The Door to Hell is a 70 m (230 ft) wide hole near the village of Derweze in Turkmenistan, filled with natural gas, which has been burning since 1971.
| The Door to Hell near Derweze, Turkmenistan |
Labels:
curiosities,
wikipedia
Wednesday, December 28, 2011
My first use of the state monad
While learning Haskell, I was looking for a concise implementation of a function which "reshapes" a list into a matrix. Given the number of rows r, the number of columns c, and a list vs, the function should take r*c values from the list and create a r by c matrix out of them. Here's the type:
toMatrix :: Int -> Int -> [a] -> [[a]]
First solution
First I wrote a simpler function that would split a list into chunks of a given size, like this:chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf c vs = h : (chunksOf c t)
where (h, t) = splitAt c vs
Using the above, toMatrix could be implemented this way:toMatrix r c = chunksOf c . take (r*c)I had a feeling that a function like chunksOf should be already present somewhere in the standard library, so I asked Hoogle, but to no avail. There was chunksOf in Data.Text, but it operated on Text only (I retroactively named my function after the one in Data.Text). However, Hoogle returned replicateM as well…
Second solution
… and I realized I could use it with the state monad to implement toMatrix. The state could contain the list of values yet to be consumed, and the action to be replicated could be chopping off c values from the list:splitOnce :: Int -> State [a] [a]
splitOnce c = do
s <- get
let (h, t) = splitAt c s
put t
return h
After a while I realized that the same function could be written in a much more concise form:splitOnce' :: Int -> State [a] [a] splitOnce' = state . splitAtThe solution was, therefore:
toMatrix r = evalState . replicateM r . state . splitAt
Summary
The second version is good enough for me and as a bonus it helped me understand the state monad. Note that the two implementations of toMatrix are not equivalent, as they handle lists shorter than r*c in different ways. Future work: find a concise and preferably point-free implementation of chunksOf.Update (2013-01-08)
This answer on StackOverflow contains a very nice implementation of chunksOf.
Subscribe to:
Posts (Atom)

