![]() |
A deciphering problem - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Homework (https://python-forum.io/forum-9.html) +--- Thread: A deciphering problem (/thread-34626.html) |
RE: A deciphering problem - ibreeden - Aug-17-2021 Hi @Yoriz, I am impressed by your solution and by how quick you implemented it. And by the beauty of simplicity. But I must admit I also have troubles in understanding the code. I now see I have to study more on the subject of decorators in general and dataclasses in particular. One drawback of your solution is that you guessed I would use the same list of shift-keys, which was right. Thank you for your encoded answer, which deals all about good (Python)code. I am a bit reluctant to show my solution because I am afraid it is not as nice as yours. Perhaps it is because I am Dutch (as stated in your encoded message "Although that way may not be obvious at first unless you're Dutch") ![]() @Truman, we should not forget you are the one who asked the question. Are you making progress? I plan to publish and explain my solution tomorrow but I have to beautify it before I dare to show it. And perhaps someone else has a better solution. RE: A deciphering problem - jamesaarr - Aug-18-2021 (Aug-17-2021, 06:19 AM)DPaul Wrote: The keys are indeed (32, 22, ...) or (34, 44,...) depending on adding or subtracting. The movie based on real life Alan Turing :') Big Up Britain! Haha! Would highly recommending looking into how he broke the enigma cypher, I've seen people make their own and it's quite interesting! RE: A deciphering problem - ibreeden - Aug-18-2021 Hi @Truman, Did you make progress? I know the Homework section should not contain complete programs, but examples already have been posted, so I feel free to do the same. I promised to explain my ideas so here it comes. As emphasized by DPaul my algorithm is a bit fuzzy. But I like fuzzy logic. ![]() I left the program as you posted it almost un-altered. You did a good job. But instead of printing each permutation of a line to the output, each permutation is stored in an object (eat_line()). In this object the line is split to words and each word gets a score for being recognized as an English word. The average score of the line is then used as a key to store the line in a dictionary. When your program has produced all variations of a line, the line(s) with the best score is (are) printed (best_result()). As a means to find if a word is good English I installed pyspellchecker. pip install pyspellcheckerI don't know if this is the best choice. It hinders me that method "word_usage_frequency()" is not well explained. By trial end error I found that not recognized words return 0.0 and others return a (small) float. So I used this method. >>> spell = SpellChecker() >>> spell.word_usage_frequency("asked") 0.0003089325752071077 >>> spell.word_usage_frequency("askad") 0.0 >>> spell.word_usage_frequency("thus") 1.54534315113533e-05 >>> spell.word_usage_frequency("while") 0.0003726819106697944 >>> spell.word_usage_frequency("he") 0.007768271796283737 >>> spell.word_usage_frequency("it") 0.012121554730657262Now my code. from spellchecker import SpellChecker class Probability: """This stores all probable solutions of a decoded line. Each line is tested for the occurrence of good English words. eat_line(encryption_key, line) stores a line and gives it a score for the existence of English words. 0 means: nothing recognized. best_result() returns a list of lines with the best score. all_results() returns a dictionary with all lines. Intended for debugging. """ def __init__(self) -> None: """Initializes the structures to contain all probable solutions. all_sol is a dictionary. The key is the average score denoting if the words in the line are good English. 0.0 means: no English recognized. Higher scores means more words recognized. The values in the dictionary is a list. Each item of this list is again a list with two items: item[0] is the encryption_key, item[1] is the line. line_score_list is a list of scores. The same as the keys of dictionary all_sol. spell is the SpellChecker object. """ self.all_sol = {} self.line_score_list = [] self.spell = SpellChecker() def _avg(self, iterable) -> float: """_avg computes the average of a list (or any iterable) of numbers. """ return sum(iterable) / len(iterable) def eat_line(self, enc_key: int, line: str) -> None: """Consume a line of (decrypted) text. enc_key: the encryption key used. line: the line of text. The probability of the line being good English text is computed as a score. This score is used as a key to store the key and line in a dictionary. """ # word_score_list is the score for each word in the line. word_score_list = [] wordlist = self.spell.split_words(line) for word in wordlist: # Ignore words consisting of 1 letter. if len(word) > 1: word_score_list.append(self.spell.word_usage_frequency(word.lower())) avg_score = self._avg(word_score_list) self.line_score_list.append(avg_score) if avg_score in self.all_sol: self.all_sol[avg_score].append([enc_key, line]) else: # Note: add a list of lists. self.all_sol[avg_score] = [[enc_key, line]] def best_result(self) -> list: """When all permutations are "eaten", the best fit is returned. More than one line may have the same score, so a list is returned (hopefully a list with one item). Each item is a pair of encryption_key and line. """ best_score = max(self.line_score_list) return self.all_sol[best_score] def all_result(self) -> dict: """For debugging purposes: returns the complete dictionary. """ return self.all_sol yoriz_list = [ "Svr! zw!3Nz0Nsv v9N yr5N!x3BQ", "XC952v2?P2!Pux??x P?1t7P26952v2?S", "HXbeaTzXhzQTiiTgziWPczRdbeaTm3", "tRPSOHanLVnEHWWHUnWKDQnFRPSOLFDWHGq", "AgVo6dn6WZooZm6ocVi6iZnoZY9", "xG6IJ0b.Jb70KK0IbK?6Eb90EJ0e", "LYUXUVcfcns5Wiohnm8", "uD75 3.Y53G7GY3F7B'HYGD75 3.Y7BCI90YHCY4F73?YH07YFI.7Gb", "mJRFMSEFiNP?ARGA?JGRWi.C?RQiNSPGRWl", "cBB?BCUC6?E02U!3F3BU.yCCUC703!D0IX", "o86y??QyD063w3.6EQ?36y8wyxT", "qHeNB?e.9 ?eI.e9G0CAOCNS,eL?.OM?eNB?eN?GJN9NCIHeNIeAO?MMh", "DRObOucRYeVNuLOuYXO--uKXNuZbOPObKLViuYXViuYXOu--YLfSYecugKiudYuNYuSdx", "3ZhVciUVyhVOhykOmyaOmybchyPSycPjWcigyOhyTWfghyibZSggymci'fSy6ihQV2", "i0DR4.RwzAAz?RA3v9R9zCz?U", "Z E7.F67V?4G4CV8DV.5E4?V14EE4CVE7z?V*C867E*V?.HY", "d1RA3zR48 7z8z9AvA409R4.R3v?yRA0RzE 7v49,R4A'.RvRwvyR4yzvU", "7PudROuSWZVOWOXdKdSYXuScuOKciudYuOhZVKSX,uSduWKiuLOuKuQYYNuSNOKx", "Zmyq52moq5Im4qI1zqIt1zwuzsIs4qm6IupqmI--Ixq6'5Ip1Iy14qI1rI6t15qJ", ] SYMBOLS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890 !?.' print("KEY TEXT") for i in yoriz_list: prob = Probability() for key in range(len(SYMBOLS)): translated = '' for symbol in i: if symbol in SYMBOLS: symbolIndex = SYMBOLS.find(symbol) translatedIndex = symbolIndex - key if translatedIndex < 0: translatedIndex = translatedIndex + len(SYMBOLS) translated = translated + SYMBOLS[translatedIndex] else: translated = translated + symbol # Store the result in the prob object. prob.eat_line(key, translated) # Print all lines with the highest score. for res in prob.best_result(): print(f"{res[0]:3d} {res[1]}")... and here are @Yoriz 's rules: I hope you find this useful, Truman. As stated before: it is a bit fuzzy and some tweaking may be necessary. Like that I don't count one-letter words because I found they result in relative high scores.
RE: A deciphering problem - Yoriz - Aug-18-2021 @ibreeden that's it i should have included Quote:The Zen of Python, by Tim Petersfrom import this RE: A deciphering problem - DPaul - Aug-19-2021 I recognize that some very clever code has been proposed, but it is hard to see how this fits in with a starter course. I hope TS will eventually disclose what he was supposed to do, when the 'teacher?' reveals the solution. I was also challenged to provide a "better" solution, I have none that qualifies as better. I did however think of a way that the initial "brute force" approach could be limited to much less lines. This is, knowing that it is normal text, and several different lines, but the language does not matter. As it is text, there should be a 'space' every so many symbols. As you have the list of symbols, you can hardcode the index of 'space'. Then with a system of "answer = input('What do you think the space char is for this line...?")", you can try from left to right the coded letters, calculate the shift, and very soon, bingo, and you go to the next line. Works like a charm. It qualifies as "less brute force" Paul RE: A deciphering problem - Truman - Aug-21-2021 I started working on another problem in the meantime, will get back to this eventually but not sure when. I apologize if disappointed anyone. And I'm following a book, not the real teacher. RE: A deciphering problem - DPaul - Aug-22-2021 Going back to the original question by TS, this is an attempt to reduce the number of possibilities generated by the "brute force" method. It assumes plain text, so likely there are spaces between the words. I believe it will solve all Cesars, generated with this key in any language (that uses these symbols). I have replaced the symbols not found by '?' and changed the indexing system so i could do try-except. Next person to do a Cesar should use the alternate symbols list, that I was able to format correctly with the help of other posters here. ![]() Paul # import string thelist = ["qeFIP?eGSeECNNS,", "5coOMXXcoPSZIWoQI,", "avnl1olyD4l'ylDohww6DhzDjhuDil,", "z.GM?.cEQc. 70c.7KcKMKHA9AGFK,", "?MFYp2pPJJUpZSIJWpRdpMFY,", "ZqH8sl5HtqHTH4s3lyvH5zH5spH4t pHzqHlH3l5K", "Zfbi,!tif!xpvme!qspcbcmz!fbu!nfA"] # SYMBOLS = f"{string.ascii_letters}{string.digits}{string.punctuation} " # add space SYMBOLS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890 !?.' def newindex(letter, shft): try: idxnew = SYMBOLS.index(letter) + shft if idxnew >= len(SYMBOLS): idxnew = idxnew - len(SYMBOLS) return SYMBOLS[idxnew] except: return '?' idxspace = SYMBOLS.index(' ') textline = '' for line in thelist: letterIndex = 0 IcanReadIt = False while not IcanReadIt: try: shft = idxspace - SYMBOLS.index(line[letterIndex]) for letter in line: textline += newindex(letter, shft) print(f'> > > > > > > > > > > > > {textline}') answ = input('Can you read this ? (y/n) ') if answ.lower() == 'y': IcanReadIt = True textline = '' except: textline += '?' letterIndex += 1 RE: A deciphering problem - Yoriz - Aug-22-2021 I added a solver that creates a score based on the number of spaces and lower case vowels, I don't know if it works in all cases, it works on the 3 ciphers so far. import random from dataclasses import dataclass, field from typing import Generator, List SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890 !?." thelist = [ "qeFIP?eGSeECNNS,", "5coOMXXcoPSZIWoQI,", "avnl1olyD4l'ylDohww6DhzDjhuDil,", "z.GM?.cEQc. 70c.7KcKMKHA9AGFK,", "?MFYp2pPJJUpZSIJWpRdpMFY,", "ZqH8sl5HtqHTH4s3lyvH5zH5spH4t pHzqHlH3l5K", "Zfbi,!tif!xpvme!qspcbcmz!fbu!nfA", ] @dataclass(order=True) class CipherData: shift: str = field(compare=False) result: str = field(compare=False) score: float = field(default=0.0, compare=True) @dataclass class CipherLine: line: str cipher_data_possibilites: List[CipherData] def max_score_line(self): return max(self.cipher_data_possibilites) def top_3_score_lines(self): return sorted(self.cipher_data_possibilites, reverse=True)[:3] @dataclass class CeaserCipher: symbols: str def _shift_symbols_right(self, shift: int) -> str: return self.symbols[shift:] + self.symbols[:shift] def _shift_symbols_left(self, shift: int) -> str: shift = len(self.symbols) - shift return self.symbols[shift:] + self.symbols[:shift] def cipher(self, line: str, shift: int): shifted_sybols = self._shift_symbols_right(shift) result = line.translate(line.maketrans(self.symbols, shifted_sybols)) return CipherData(shift, result) def decipher(self, line: str, shift: int) -> str: shifted_sybols = self._shift_symbols_left(shift) result = line.translate(line.maketrans(self.symbols, shifted_sybols)) return CipherData(shift, result) def decipher_all_possibilties(self, line: str) -> Generator[CipherData, None, None]: for shift, _ in enumerate(self.symbols): yield self.decipher(line, shift) def cipher_random_shift(self, line: str) -> str: return self.cipher(line, random.randint(0, len(self.symbols))) def ceaser_cipher_solver(line: str) -> CipherLine: cipherer = CeaserCipher(SYMBOLS) cipher_lines = [] for cipher_data in cipherer.decipher_all_possibilties(line): cipher_data.score = float(cipher_data.result.count(" ")) or -10.0 cipher_data.score += sum(cipher_data.result.count(item) for item in ("aeiou")) cipher_lines.append(cipher_data) return CipherLine(line, cipher_lines) for line in thelist: cipher_line = ceaser_cipher_solver(line) print(cipher_line.max_score_line()) Edit: added from typing import List so it works with pre 3.9 python versions
RE: A deciphering problem - ibreeden - Aug-23-2021 Hi @DPaul , I like your solution. It took me a while to understand the algorithm but it is simple and efficient! You just assume the first character is a space, and from that assumption you decypher the whole line. If the user rejects the result, you assume the second character to be a space and so on. This way you find the solution after all letters of the first word have been tried. Nice! I learned from that. Now I am trying to understand the solution of Yoriz. His program looks very neat but some comment about the algorithm would have been helpful. He believes the code is self-explanatory as good code should be. The names of the classes (CipherData, CipherLine, CeaserCipher) indeed give a good impression of what it is all about, but it does not immediately make clear to me how the whole algorithm works. But never mind, I want to learn and hope to have it understood by tomorrow. RE: A deciphering problem - DPaul - Aug-23-2021 Hi ibreeden, I would like to study Yoriz' solution too, but both version 1 and 2 give a "TypeError: 'type' object is not subscriptable". Do you get it working? Paul |