Python Forum
Merge htm files with shutil library (TypeError: 'module' object is not callable)
#1
Why is my code not working?

import shutil

filenames = shutil('*.htm')  # list of all .htm files in the directory

with open('output_file.htm','wb') as wfd:
    for f in filenames:
        with open(f,'rb') as fd:
            shutil.copyfileobj(fd, wfd)
I get this error:

Error:
Traceback (most recent call last):
  File "E:\Carte\BB\17 - Site Leadership\alte\Ionel Balauta\Aryeht\Task 1 - Traduce tot site-ul\Doar Google Web\Andreea\Meditatii\Sedinta 31 august 2022\merge txt - versiune 2 .py", line 3, in <module>
    filenames = shutil('*.htm')  # list of all .htm files in the directory
TypeError: 'module' object is not callable
#2
Use
import glob
filenames = glob.glob('*.htm')
#3
(Aug-28-2022, 07:04 AM)Gribouillis Wrote: Use
import glob
filenames = glob.glob('*.htm')

Yes, it works with glob. But I still don't understand why shutil is not working?!
#4
(Aug-28-2022, 07:07 AM)Melcu54 Wrote: But I still don't understand why does shutil is not working. ?!
Because shutil is a module. It is not a function that returns a list of files.
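To make that concrete, here is a small sketch. The module object itself cannot be called; only the functions inside it can. The helper name list_htm and the temporary file names are made up for illustration.

```python
import glob
import os
import shutil
import tempfile

# A module is a namespace that holds functions; it is not a function itself.
print(callable(shutil))              # False
print(callable(shutil.copyfileobj))  # True


def list_htm(directory):
    """Return the sorted names of the .htm files in directory (hypothetical helper)."""
    pattern = os.path.join(directory, "*.htm")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Create two .htm files and one file that must not match the pattern
    for name in ("a.htm", "b.htm", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(name)
    found = list_htm(tmp)
    print(found)
```

Calling shutil('*.htm') fails for the same reason that callable(shutil) is False, while glob.glob is a function inside the glob module and does return a list of matching paths.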
#5
Line 3: why are you trying to call the shutil module as a function?
#6
(Aug-28-2022, 07:10 AM)ndc85430 Wrote: Line 3: why are you trying to call the shutil module as a function?

OK, now it works. Thanks a lot!

import glob
import shutil

# list of all .htm files in the current directory
filenames = glob.glob('*.htm')

with open('output_file.htm','wb') as wfd:
    for f in filenames:
        with open(f,'rb') as fd:
            shutil.copyfileobj(fd, wfd)
#7
If you run the program twice, the content of "output_file.htm" is itself merged into "output_file.htm", because the output file now matches the glob pattern and is read and written at the same time. The file keeps growing while it is being copied into itself. I tried this on an NVMe drive and the file size grew very fast. I had to delete 4 GiB of data afterward because of this silly mistake.

Example with Path objects:
from pathlib import Path
from shutil import copyfileobj


def merge(path: str | Path, glob: str, output: str | Path, show: bool = False) -> None:
    """
    Merge files found in path by glob pattern.
    All data is written to output.

    Args:
        path (str | Path): Path to find files
        glob (str): glob pattern to find files in path
        output (str | Path): Output file
        show (bool, optional): Print processed file. Defaults to False.
    """
    # Resolve output to an absolute Path so the comparison below also works
    # when path and output are spelled differently (relative vs. absolute)
    output = Path(output).resolve()

    # Exclude the output file from the list of input files
    files = [file for file in Path(path).glob(glob) if file.resolve() != output]

    # Keep in mind that glob() returns files in no guaranteed order.
    # Sorting by modification time here, but I guess that is not what you want.
    files.sort(key=lambda file: file.stat().st_mtime)

    if not files:
        return

    with open(output, "wb") as fd_out:
        for file in files:
            if show:
                print(f"Processing {file}")

            with file.open("rb") as fd_in:
                copyfileobj(fd_in, fd_out)


merge(".", "*.txt", "output_file.txt", show=True)
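If modification time is not the order you want, sorting by file name is one alternative. The following is only a sketch under that assumption: merge_by_name is a hypothetical variant of the function above, and the temporary files exist only to show that a second run does not grow the output.

```python
from pathlib import Path
from shutil import copyfileobj
import tempfile


def merge_by_name(path, pattern, output):
    """Merge files matching pattern in path into output, ordered by file name."""
    output = Path(output).resolve()
    # Skip the output file itself so a second run does not re-read its own result.
    # sorted() collects the full list before the output file is opened for writing.
    files = sorted(
        (f for f in Path(path).glob(pattern) if f.resolve() != output),
        key=lambda f: f.name,
    )
    with output.open("wb") as fd_out:
        for file in files:
            with file.open("rb") as fd_in:
                copyfileobj(fd_in, fd_out)
    return [f.name for f in files]


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "b.txt").write_bytes(b"second\n")
    (tmp / "a.txt").write_bytes(b"first\n")
    out = tmp / "output_file.txt"
    merge_by_name(tmp, "*.txt", out)
    merge_by_name(tmp, "*.txt", out)  # second run: output is excluded, not appended
    merged = out.read_bytes()
    print(merged)
```

Plain name sorting puts "file10.txt" before "file2.txt", so numbered files may need zero-padded names or a custom sort key.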
#8
.htm and .html files are text files.

I don't have any .htm files, but I do have .html files!

Why you would want to put all the .htm* files together in one text file, I have no idea. How does that help? Why not put them all in a zip file?

But you could do it like this, just using Path from pathlib:

from pathlib import Path

destination = Path('/home/pedro/temp/output.txt')
source = Path('/var/www/html/')

files_list = sorted(source.glob('*.html'))

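# mode='w' starts from an empty file; mode='a' would append again on every run
with destination.open(mode='w') as d:
    for file in files_list:
        # glob() already yields full paths such as PosixPath('/var/www/html/index.html'),
        # so there is no need to join them with source again
        with file.open() as f:
            html = f.read()
            d.write(html)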
I found that d.write_text(html) did not work, and I'm not sure why; the pathlib docs seem to say it should work:

Quote:
Traceback (most recent call last):
  File "<pyshell#19>", line 6, in <module>
    d.write_text(html)
AttributeError: '_io.TextIOWrapper' object has no attribute 'write_text'
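The traceback points at the likely reason: destination.open() returns an ordinary file object (an io.TextIOWrapper), while write_text() is a method of Path objects, not of file objects. A minimal sketch of the two forms, using a made-up temporary file:

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    destination = Path(tmp) / "output.txt"

    # write_text() belongs to the Path object: it opens, writes and closes in one call
    destination.write_text("<p>one</p>\n")

    # open() returns a file object, which has write() but no write_text()
    with destination.open(mode="a") as d:
        has_write_text = hasattr(d, "write_text")
        d.write("<p>two</p>\n")

    content = destination.read_text()
    print(has_write_text)
    print(content)
```

Note that Path.write_text() replaces the whole file on each call, so inside a loop over many input files the open file object with d.write() is the better fit.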