Maximum number of files inside a single folder? (EXT3 FS)



#1 Brito (Platinum Member, .script developer)

Posted 28 December 2010 - 12:19 AM

Hello,

I'm running an experiment that produces several million results.

In the past I tried storing these values in a database, but performance degraded quickly as the number of records increased.

So I've reverted to a simpler solution: instead of using a database, I store each record as an individual file.


Things seemed to be going well, but what is the limit on the number of files that one can hold inside a single folder?

I'm running Ubuntu x64 on an EXT3 file system. Looking at Wikipedia, I see:

The maximum number of inodes (and hence the maximum number of files and directories) is set when the file system is created. If V is the volume size in bytes, then the default number of inodes is given by V/(2^13) (or the number of blocks, whichever is less), and the minimum by V/(2^23). The default was deemed sufficient for most applications. The max number of subdirectories in one directory is fixed to 32000.

http://en.wikipedia.org/wiki/Ext3

So, to get the numbers for the formula, I followed the instructions on this page:
http://www.howtoadvice.com/Ext3Max

I ran the command below, which reports the block size:
sudo /sbin/dumpe2fs /dev/sdb1 | grep "Block size"

The result is 4096 (bytes per block).

The volume size (V) is 1 447 337 552 bytes, and 2^13 = 8192, so those are the values to use in the formula.

Computing 1447337552/(2^13) gives roughly 176 676, but the file manager reports that more files than that are already inside the folder. If I instead divide by the block size of 4096 (the "number of blocks, whichever is less" part of the rule), the result is 353 353.
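
(I suppose the real limit could also be read straight from the superblock instead of derived from the formula; assuming the same /dev/sdb1 as above, and with /path/to/mount standing in for wherever the volume is mounted, something like:
sudo /sbin/dumpe2fs -h /dev/sdb1 | grep -i "inode count"
df -i /path/to/mount
should report the total and in-use inode counts directly, though I haven't tried that yet.)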

I'm unable to access the folder from the GNOME file manager; when I select the folder I can only see the tooltip in the footer of the window, which indicates 1 436 916 files in the subfolder.

---------------------

Is it correct to assume that, in this case, the maximum is between 176 and 353 thousand files per folder?

Thanks.

#2 Brito (Platinum Member, .script developer)

Posted 30 December 2010 - 12:47 PM

Update.

I got a helpful reply by personal message and have in the meanwhile decided to use a different approach.

----------

However, I created a test folder to hold these files on the EXT3 volume, and now I'm unable to remove it from the file system.

Any tips on how one can do such a thing?

Trying rm -rf ./test doesn't work (it gets stuck), and trying to delete from the file manager causes a similar freeze.

I can't even run an ls command inside the subfolder to check how many files are there.

---

Does anyone know another way of removing this folder? (that doesn't involve formatting the drive)

Thank you.

#3 Icecube (Gold Member, Team Reboot)

Posted 30 December 2010 - 02:05 PM

You can try "ls -f" to view the files (disables sorting and displaying the filesize of each file)

You can try this:
find ./test  -exec rm -f {} \;
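
If that is too slow, a variant that skips directories and batches many files per rm invocation (with GNU find) might help:
find ./test -type f -exec rm -f {} +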


#4 Brito (Platinum Member, .script developer)

Posted 30 December 2010 - 07:24 PM

Hi Icecube,

Trying "ls -f" doesn't yield visible results and just gets stuck.

Your second suggestion:
find ./test  -exec rm -f {} \;
is very interesting, but it also hangs the command line, only complaining that ./test is a directory and then getting stuck like every other command.

I guess we've uncovered a bug in the EXT3 file system.

#5 Wonko the Sane (The Finder, Advanced user)

Posted 31 December 2010 - 02:53 PM

I guess we've uncovered a bug in the EXT3 file system.


I guess we also discovered a much more severe bug in your line of reasoning....:cheers:

Doesn't this imply :whistling::

Does anyone know another way of removing this folder? (that doesn't involve formatting the drive)

that you are carrying out dangerous experiments on a production machine/disk for which you do not have a backup/image handy? ;)

:cheers:
Wonko

#6 Brito (Platinum Member, .script developer)

Posted 31 December 2010 - 04:21 PM

that you are carrying out dangerous experiments on a production machine/disk for which you do not have a backup/image handy?

Come on.. where is your sense of adventure? :wheelchair:

The Gnome file manager is actually "preparing" to delete the folder, and shows a body count of 6 million while reading all the files prior to deleting them.

If the generating algorithm ran all the way to the end, then around 12 million files should be inside the folder.

Are we setting a Guinness world record for the maximum number of files inside a single folder?

:)

#7 Wonko the Sane (The Finder, Advanced user)

Posted 31 December 2010 - 04:36 PM

Come on.. where is your sense of adventure? :wheelchair:

"Adventure" is dealing with the UNexpected.

JFYI:

Adventure is just bad planning.



More here:
http://en.wikipedia..../Roald_Amundsen

I may say that this is the greatest factor -- the way in which the expedition is equipped -- the way in which every difficulty is foreseen, and precautions taken for meeting or avoiding it. Victory awaits him who has everything in order -- luck, people call it. Defeat is certain for him who has neglected to take the necessary precautions in time; this is called bad luck.


In any case, Murphy's Law RULEZ :)

Just in case, you were given a good :unsure: suggestion some time ago:
http://en.wikipedia....amming_language)
that may apply to your current problem too.


:cheers:
Wonko

#8 Brito (Platinum Member, .script developer)

Posted 31 December 2010 - 05:26 PM

Just in case, you were given a good suggestion some time ago:
http://en.wikipedia....amming_language)
that may apply to your current problem too.

Thanks, but Java is my weapon of choice nowadays.


Adventure is just bad planning.

Had Mr. Roald Amundsen ever dealt with millions of files? :wheelchair: I say this because Murphy surely observed some neat principles about software, such as:

If you perceive that there are four possible ways in which a procedure can go wrong, and circumvent these, then a fifth way, unprepared for, will promptly develop.

http://quotations.about.com/cs/murphyslaws/a/bls_murphys_law.htm

#9 Wonko the Sane (The Finder, Advanced user)

Posted 31 December 2010 - 05:52 PM

I say this because Murphy has surely observed some neat principles about software, such as:
http://quotations.ab...murphys_law.htm

You missed the implied contradiction in the (unfortunately very sad) epilogue of Roald Amundsen's life: notwithstanding his obsessive attention to foreseeing events and being prepared for them, he died in an airplane accident while attempting to rescue other North Pole explorers.

Murphy's Law(s) are ALWAYS hanging around, trying to prove themselves right once again. :wheelchair:

For this particular topic I would rather have chosen:
http://www.murphys-l...technology.html

A computer makes as many mistakes in two seconds as 20 men working 20 years make.


and

Computers are unreliable, but humans are even more unreliable. Any system which depends on human reliability is unreliable.


and

If it works in theory, it won't work in practice.
If it works in practice, it won't work in theory.


:unsure:

:)
Wonko

#10 Icecube (Gold Member, Team Reboot)

Posted 01 January 2011 - 10:14 PM

Your second suggestion:

find ./test  -exec rm -f {} \;
is very interesting, but it also hangs the command line, only complaining that ./test is a directory and then getting stuck like every other command.

Are you sure it got stuck?
Deleting that many files can take a while.
The command will print an error message when it can't delete a directory (all the files inside that directory have to be removed first), but it still continues with the next file.
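
If you want to check that it's actually making progress, watching the in-use inode count drop should work; assuming the volume is mounted at /mnt/data (adjust the path to your setup):
watch -n 60 df -i /mnt/data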

#11 Brito (Platinum Member, .script developer)

Posted 01 January 2011 - 10:42 PM

You're right, this might indeed take a few hours or days.

Right now I'm repeating the delete operation using the Nautilus file manager so that I can also see how many files are inside the folder.

If that doesn't work, then I'll repeat the command you suggested and wait, wait, and wait until it completes.

#12 sbaeder (Gold Member, .script developer)

Posted 02 January 2011 - 04:54 AM

Any tips on how one can do such a thing?

Trying rm -rf ./test doesn't work (it gets stuck), and trying to delete from the file manager causes a similar freeze.

I can't even run an ls command inside the subfolder to check how many files are there.

---

Does anyone know another way of removing this folder? (that doesn't involve formatting the drive)

If you know anything about the patterns in the file names, you can always do things like
rm *11
which limits the shell to just the files matching the pattern. But then again, for 10+ million files??? Who knows.
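
One caveat: with millions of matches, the shell's expanded command line can exceed the kernel's argument-length limit, and rm fails with "Argument list too long". Letting find do the matching and batch the arguments (with GNU find) avoids that, e.g.:
find ./test -maxdepth 1 -name '*11' -exec rm -f {} +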

I like the "find and delete" option the best, but it could take a LONG LONG TIME...even at 100 files per second, that's on the order of 100K seconds (over 27 hours), and not sure it would go that fast (on average) so it could be 2x or more longer...

At some point, a backup of the other files on the FS and a reformat might be a better option...

:thumbsup:

#13 Brito (Platinum Member, .script developer)

Posted 02 January 2011 - 01:43 PM

Yes, I agree.

Right now time is not an issue and I can wait as long as necessary.

Looking at the file manager, it has now found over 4 million files and counting. Perhaps another 24 hours until all the files are found, and then perhaps as much time again to delete them all.

A Guinness world record... :thumbsup:



