Python Forum
Brute Forcing Anagrams
#11
I'll dig into this and see whether I can find anything useful in the code you provided.

For now, my best idea is to convert each row or column of characters into a letter count. For example, horizontal line 1 (h1) could be read as h1 = 2a, 1b, 1c, 4e, etc. Count each letter and drop the letters with a count of zero. SHOULD I STORE THESE AS LISTS?

Next, I need to use the same function on each word in the word list, so the word QUARTER could be read as word_1 = 1a, 1e, 1q, 2r, 1t, 1u.
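
A minimal sketch of the counting idea, assuming the standard library's collections.Counter is acceptable in place of hand-built lists. The function name letter_counts is made up for illustration; the row is the first row of the grid posted in this thread.

```python
from collections import Counter

def letter_counts(text):
    """Return a Counter of the letters in text, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

# First row of the grid posted in this thread.
h1 = letter_counts("qvucalilenineteenbma")
word_1 = letter_counts("QUARTER")

# A Counter only stores letters that actually occur, so there are no
# zero entries to strip out afterwards.
print(sorted(h1.items()))
print(sorted(word_1.items()))
```

A Counter behaves like a dictionary from letter to count, which is probably easier to compare than a list, because looking up a letter that is absent simply returns 0.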

Then I need to compare that against every line (FOR EACH?), h1-h15 and v1-v20. I'm looking for lines that can accommodate the whole character count of word_1.

If word_1 has 2 Rs and, for example, line v3 has 3 Rs, I obviously want to treat that as a match. So I assume I need to tell it "equal to or greater than" when comparing the letter counts of word_1 against each line.
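
One way the "equal to or greater than" comparison could look, as a sketch only. The helper name fits and the h/v labels are assumptions chosen to match the naming in this post; the grid is the one posted in this thread, and the columns are derived from it with zip.

```python
from collections import Counter

def fits(word, line):
    """True if line holds every letter of word at least as many times."""
    need = Counter(word.lower())
    have = Counter(line.lower())
    return all(have[ch] >= n for ch, n in need.items())

# Grid as posted in this thread (15 rows of 20 characters).
rows = ("qvucalilenineteenbma", "cretehtreaoucsifevle", "oroundhnimihstoliaro",
        "uhtronroeocehsrardwo", "noubtdeloeltlntretaw", "tetatseonyneoaolineo",
        "pmuehttsewodahreeret", "eonskaeptoooilpasiec", "lesdbnooraooevlewtsr",
        "entufivetntncoiujtso", "vrniemaklawescodtrid", "eadtteyossquareeerht",
        "nmtopleceohetquarryo", "nsiosmixzfdqbccasliy", "apsorezngqzpsapphire")

# Transpose the rows to get the vertical lines.
columns = ["".join(chars) for chars in zip(*rows)]

# Label the lines h1..h15 and v1..v20.
lines = {f"h{i}": row for i, row in enumerate(rows, start=1)}
lines.update({f"v{i}": col for i, col in enumerate(columns, start=1)})

# Names of every line that can accommodate the whole word.
matches = [name for name, line in lines.items() if fits("quarter", line)]
print(matches)
```

Note that this only checks that the letters are available somewhere in the line; it does not require them to be next to each other.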

Sorry about the replies saying I was too vague. I hope this extra explanation makes clearer what I was originally trying to do.

buran write Jan-10-2025, 01:20 PM:
Spam link removed
Also, please, translate your post in English
#12
(Jan-10-2025, 08:19 AM)perfringo Wrote:
(Jan-09-2025, 09:26 PM)Anorak Wrote: Where are the matches being output?

There is no print in this code, so nothing is output. However, the anagrams found by this code are stored in the helpfully named anagrams. You can print them out or do whatever you want with them.

Doh! Of course, I'm really showing how little I've absorbed over the last two weeks. I definitely covered print and I can't believe I didn't think to try that. I think it's going to be a bit of a learning curve.
#13
Quote: Does it mean that "anagrams" are not from consecutive characters and you look for them in the characters of the whole row/column?

I decided to find "classic anagrams" (i.e. sequences of contiguous characters) with brute force. For that, I:

- grabbed the initial data
- downloaded a list of English words from GitHub (containing 370 104 words)
- made indices for slicing a row of 20 characters, starting from a length of two (i.e. ignoring one-letter words)
- made slices using the indices and made sorted strings out of them (for comparison later)
- read the word file into a dictionary where each key is a string of sorted characters and the values are the words that can be made from that key
- iterated over the strings made from the slices and added the values from the dictionary whenever a string matched a key
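
To make the steps above concrete, here is a toy-sized sketch. The 5-character row and the 7-word list are invented for illustration and stand in for the real 20-character rows and the word file. For a 20-character row the same comprehension would give 190 slices.

```python
row = "tesla"  # hypothetical short row
words = ["east", "seat", "teas", "late", "tale", "steal", "least"]  # toy word list

# Every (start, end) pair that gives a slice of at least two characters.
indices = [(start, end)
           for start in range(len(row))
           for end in range(start + 2, len(row) + 1)]

# Key: the word's letters in sorted order; value: all words sharing those letters.
mapping = {}
for word in words:
    mapping.setdefault("".join(sorted(word)), set()).add(word)

# Sort each slice the same way and look it up.
found = set()
for start, end in indices:
    key = "".join(sorted(row[start:end]))
    found |= mapping.get(key, set())

print(len(indices), sorted(found))
```

Sorting the letters gives every anagram of a word the same key, so a single dictionary lookup per slice is enough.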

The actual code probably has fewer characters than the description of it. I didn't do any tests, so there are no guarantees that the result is correct. It's just to demonstrate one possible brute force approach (only for rows):

edit2: Never mind my previous question, I figured it out with the help of AI. Here's where I'm at (with some help from AI *embarrassed emoji*):

[inline]data = ("qvucalilenineteenbma", "cretehtreaoucsifevle", "oroundhnimihstoliaro",
        "uhtronroeocehsrardwo", "noubtdeloeltlntretaw", "tetatseonyneoaolineo",
        "pmuehttsewodahreeret", "eonskaeptoooilpasiec", "lesdbnooraooevlewtsr",
        "entufivetntncoiujtso", "vrniemaklawescodtrid", "eadtteyossquareeerht",
        "nmtopleceohetquarryo", "nsiosmixzfdqbccasliy", "apsorezngqzpsapphire")
# Assign aliases row1, row2, ..., row15
aliases = {f"row{i + 1}": string for i, string in enumerate(data)}

# Generate slices by sorting substrings of each row
indices = [(start, end) for start in range(20) for end in range(start + 2, 21)]
slices = [(alias, ''.join(sorted(row[begin:stop])))
          for alias, row in aliases.items()
          for (begin, stop) in indices]

# Mapping anagrams based on words_rpo.txt file
mapping = dict()
with open("words_rpo.txt", "r") as f:
    for row in f:
        word = row.strip()
        value = ''.join(sorted(word))
        mapping.setdefault(value, set()).add(word)

# Find matched anagrams with their aliases
anagrams_with_aliases = {}
for alias, sorted_slice in slices:
    if sorted_slice in mapping:
        for word in mapping[sorted_slice]:
            if word not in anagrams_with_aliases:
                anagrams_with_aliases[word] = set()
            anagrams_with_aliases[word].add(alias)

# Prepare sorted output with aliases
sorted_anagrams = sorted(anagrams_with_aliases.items(), key=lambda x: len(x[0]))

# Print the results: Each anagram followed by its aliases
for anagram, alias_set in sorted_anagrams:
    print(f"Anagram: {anagram}, Found in rows: {', '.join(sorted(alias_set))}")[/inline]
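
The code above only covers rows. A possible way to extend it to columns, sketched under the assumption that all rows have the same length: transpose the grid with zip and compute the slice indices from each line's own length instead of the hard-coded 20, since the columns here are only 15 characters long. The names column_aliases and sorted_slices are made up for this sketch, and only the first three rows of the grid are used to keep it short.

```python
# First three rows of the grid from this thread, to keep the example short.
data = ("qvucalilenineteenbma", "cretehtreaoucsifevle", "oroundhnimihstoliaro")

# zip(*data) groups the first characters of every row, then the second, and so on.
columns = ["".join(chars) for chars in zip(*data)]
column_aliases = {f"col{i + 1}": col for i, col in enumerate(columns)}

def sorted_slices(aliases, min_len=2):
    """Yield (alias, sorted substring) for each contiguous slice of each line."""
    for alias, line in aliases.items():
        n = len(line)  # use the line's own length, not a fixed 20
        for start in range(n):
            for end in range(start + min_len, n + 1):
                yield alias, "".join(sorted(line[start:end]))

print(column_aliases["col1"])
print(list(sorted_slices({"col1": column_aliases["col1"]})))
```

The pairs produced by sorted_slices have the same shape as the slices list above, so they could be fed into the same lookup loop.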