Python Forum
Excel Question
#11
Ah, ok. I'm used to the testyourmight forums, where a user won't be notified that you responded unless you quote or tag them. Thanks for the heads up!
#12
(Jan-05-2018, 09:35 PM)karaokelove Wrote: Is there a good way to reply or tag someone without quoting their entire message?
There is a "Quote highlighted text" button to the left of the Quote button.
(Jan-05-2018, 05:45 PM)karaokelove Wrote: For instance, we have a lot of "If the model is a [car-model], what is the make?". Basically, questions that are totally different, but contain 90%+ similar wording. A simple percent-based comparison algorithm would flag these as duplicates and delete usable questions.
You may need to read the Excel file and compare the contents yourself to get what you want. If @Gribouillis's method can be used, it is certainly fast.
The binary Excel format may not make much sense until it has been read in.
Python has several good libraries for this, e.g. Pandas, openpyxl, pyexcel.
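
To illustrate the concern quoted above, here is a minimal sketch using the standard library's difflib. The two questions are hypothetical examples modeled on the pattern karaokelove described, not data from the actual file. It shows that a percent-based comparison scores them as very similar even though they are different questions, while an exact comparison keeps them apart.

```python
from difflib import SequenceMatcher

# Hypothetical questions, made up to match the pattern described in the thread.
q1 = "If the model is a Civic, what is the make?"
q2 = "If the model is a Corolla, what is the make?"

# Similarity ratio between 0.0 (nothing in common) and 1.0 (identical).
similarity = SequenceMatcher(None, q1, q2).ratio()

# Exact comparison only flags strings that are completely identical.
is_exact_duplicate = q1 == q2

print(f"similarity: {similarity:.2f}")
print(f"exact duplicate: {is_exact_duplicate}")
```

A similarity threshold would likely treat this pair as duplicates, which is why exact matching on the cell contents is the safer choice here.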

I like Pandas because once a file has been read in, Pandas gives you a lot of power and several ways to remove duplicates.
As an example, take email_excel.xlsx, which contains two duplicate emails.
Output:
name        message          email
John Smith  Hello Mr. Smith  [email protected]
John Doe    Hello Mr. Doe    [email protected]
Ms. Foo     Hello Ms. Foo    [email protected]
Fire up Pandas and fix it.
G:\Anaconda3
λ python -m ptpython
>>> import pandas as pd

>>> # Newer Pandas versions spell the keyword sheet_name (formerly sheetname)
>>> df = pd.read_excel('email_excel.xlsx', sheet_name=0)
>>> df.head()
         Name          message            email
0  John Smith  Hello Mr. Smith  [email protected]
1    John Doe    Hello Mr. Doe  [email protected]
2     Ms. Foo    Hello Ms. Foo  [email protected]


>>> # Flag any cell value already seen earlier, then drop rows containing one
>>> remove_dup = df[~df.stack().duplicated().unstack().any(axis=1)]
>>> remove_dup
         Name          message            email
0  John Smith  Hello Mr. Smith  [email protected]
1    John Doe    Hello Mr. Doe  [email protected]

>>> remove_dup.to_excel('email_nodup.xlsx')
Now we have this in email_nodup.xlsx.
Output:
Name        message          email
John Smith  Hello Mr. Smith  [email protected]
John Doe    Hello Mr. Doe    [email protected]
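
As one of the other ways to remove duplicates mentioned above, here is a minimal sketch using drop_duplicates on a single column. The DataFrame is built in memory as a stand-in for email_excel.xlsx, and the email addresses are made up for illustration, since the real ones are not shown.

```python
import pandas as pd

# Hypothetical stand-in for email_excel.xlsx; the addresses are invented.
df = pd.DataFrame({
    "Name": ["John Smith", "John Doe", "Ms. Foo"],
    "message": ["Hello Mr. Smith", "Hello Mr. Doe", "Hello Ms. Foo"],
    "email": ["smith@example.com", "doe@example.com", "smith@example.com"],
})

# Keep the first row for each email address and drop later repeats.
no_dup = df.drop_duplicates(subset="email", keep="first")

print(no_dup)
```

Unlike the stack/unstack approach, this only looks at the email column, so rows that repeat a value in another column are left alone.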
#13
(Jan-05-2018, 11:12 PM)snippsat Wrote: There is a Quote highlighted text to the right of Quote button.
Wow, thanks for all the useful information! I'll be sure to check those out.

