Discussion:
tar zeroes out part of file.
Siri Cruise
2020-11-29 22:40:58 UTC
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
--
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
'I desire mercy, not sacrifice.' /|\
Discordia: not just a religion but also a parody. This post / \
I am an Andrea Doria sockpuppet. insults Islam. Mohammed
Keith Thompson
2020-11-30 00:03:11 UTC
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?

I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
Siri Cruise
2020-11-30 00:48:17 UTC
Post by Keith Thompson
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?
Because I invoke od to use octal addresses.
Post by Keith Thompson
I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
I used od to dump the source and tar-copied file. The source has
random numbers throughout. The copy is zeroed until 040000000 and
then matches the original.
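
Rather than reading od output by eye, the damage can be measured by counting how many NUL bytes a file starts with. The following is only a sketch: leading_zero_bytes is a made-up helper name, not something from the thread, and it assumes od, tr and awk behave as POSIX describes.

```shell
# Hypothetical helper (not from the thread): print the number of leading
# zero bytes in the file named by $1.
leading_zero_bytes() {
    # od prints every byte as an unsigned decimal number, tr puts one value
    # per line, and awk counts values until the first non-zero one.
    od -An -v -tu1 "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 != 0) exit; n++ } END { print n + 0 }'
}
```

For a copy damaged as described above, this should report 8388608 (040000000 octal). Running it on both the source and the copy gives two numbers that are easy to compare.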
Gary R. Schmidt
2020-11-30 04:29:44 UTC
Post by Siri Cruise
Post by Keith Thompson
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?
Because I invoke od to use octal addresses.
Post by Keith Thompson
I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
I used od to dump the source and tar-copied file. The source has
random numbers throughout. The copy is zeroed until 040000000 and
then matches the original.
Have you checked that the files aren't "holey"? As in they are only
written to from your mentioned offset.

Cheers,
Gary B-)
--
Waiting for a new signature to suggest itself...
Siri Cruise
2020-11-30 06:07:25 UTC
Post by Gary R. Schmidt
Post by Siri Cruise
Post by Keith Thompson
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?
Because I invoke od to use octal addresses.
Post by Keith Thompson
I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
I used od to dump the source and tar-copied file. The source has
random numbers throughout. The copy is zeroed until 040000000 and
then matches the original.
Have you checked that the files aren't "holey"? As in they are only
written to from your mentioned offset.
This is an HFS+ volume. I don't think it supports that kind of
stuff. Are there any commands to get file structure secrets?
Paul
2020-11-30 18:31:52 UTC
Post by Siri Cruise
Post by Gary R. Schmidt
Post by Siri Cruise
Post by Keith Thompson
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?
Because I invoke od to use octal addresses.
Post by Keith Thompson
I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
I used od to dump the source and tar-copied file. The source has
random numbers throughout. The copy is zeroed until 040000000 and
then matches the original.
Have you checked that the files aren't "holey"? As in they are only
written to from your mentioned offset.
This is an HFS+ volume. I don't think it supports that kind of
stuff. Are there any commands to get file structure secrets?
If I run a search on 8MB, I get a reference to Time Machine
using "8MB bands". 040000000 octal = 8388608 decimal = 8 MiB

https://discussions.apple.com/thread/3734638

Your description doesn't make much sense from a normal
failure scenario perspective. You could try Disk First Aid on the
volume and see if it detects trouble. The only reason
I looked for such a size, was to look for any kind
of aggravating factor. HFS+ doesn't support sparse files.

Paul
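
The octal-to-decimal conversion above can be checked from the shell. This is a small sketch; the function names octal_to_decimal and bytes_to_mib are illustrative only. It relies on printf treating an argument with a leading 0 as an octal constant, as POSIX specifies for %d.

```shell
# Hypothetical helpers (names are illustrative, not from the thread).

# Print the decimal value of an octal constant written with a leading 0,
# e.g. the offsets that od prints by default.
octal_to_decimal() {
    printf '%d\n' "$1"
}

# Convert a byte count to whole MiB (1 MiB = 1048576 bytes).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}
```

Calling octal_to_decimal 040000000 prints 8388608, and bytes_to_mib 8388608 prints 8.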
Kaz Kylheku
2020-11-30 21:11:59 UTC
Post by Gary R. Schmidt
Post by Siri Cruise
Post by Keith Thompson
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Why do you write "040000000"? Is that supposed to be octal?
Because I invoke od to use octal addresses.
Post by Keith Thompson
I presume you've already confirmed that the files in question don't
already start with that number of zero bytes.
I used od to dump the source and tar-copied file. The source has
random numbers throughout. The copy is zeroed until 040000000 and
then matches the original.
Have you checked that the files aren't "holey"? As in they are only
written to from your mentioned offset.
But then how would the original file have random numbers throughout?

If there is a hole up to 040000000, then under ordinary I/O that appears
as a range of all-zero bytes.

GNU tar requires the --sparse or -S option for sparseness detection.
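
One rough way to check whether a file has holes is to compare its length with the space actually allocated for it. The sketch below assumes a POSIX du and wc; the helper names are made up for illustration. A file on a compressing filesystem can also show less allocated space than its length, so a positive result is a hint, not proof.

```shell
# Hypothetical helpers (names are illustrative, not from the thread).

# Length in bytes as seen by ordinary reads (holes read back as zeros).
apparent_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Space actually allocated on disk, in KiB, according to du.
allocated_kib() {
    du -k "$1" | awk '{ print $1; exit }'
}

# Succeeds if less space is allocated than the file's length suggests.
looks_sparse() {
    [ $(( $(allocated_kib "$1") * 1024 )) -lt "$(apparent_bytes "$1")" ]
}
```

On a filesystem without sparse-file support, looks_sparse should fail for every uncompressed file, which would be consistent with holes not being the explanation here.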
Jorgen Grahn
2020-11-30 22:23:15 UTC
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
No, but I think I'd use strace (or your OS's equivalent) to find out
what it's actually doing. Is there a point where it writes megabytes
of \0 to fd 1? Is there a point where it reads megabytes of \0? and
so on.

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Siri Cruise
2020-12-01 00:21:33 UTC
Post by Jorgen Grahn
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
No, but I think I'd use strace (or your OS's equivalent) to find out
what it's actually doing. Is there a point where it writes megabytes
of \0 to fd 0? Is there a point where it reads megabytes of \0? and
so on.
/Jorgen
Tcl file copy works so I've written a command to do the recursive
copy. I can't use cp -R since it separates hard links. So the
command catalogs dev and ino to link outputs where inputs are
linked.
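
The poster's script is in Tcl and is not shown. Purely as an illustration of the same idea (remember which inode each input file has, and link instead of copy when an inode comes up again), here is a shell sketch. It is not the poster's code. It assumes the source tree sits on one filesystem, so the inode number alone identifies a file, and that file names contain no newlines. The name copy_keep_links is made up.

```shell
# Hypothetical sketch (not the poster's Tcl script): copy the regular files
# under $1 to $2, re-creating hard links between them.
# Assumes a single source filesystem and no newlines in file names.
copy_keep_links() {
    src=$1
    dst=$2
    seen=$(mktemp) || return 1      # catalog of "inode<TAB>first path copied"
    mkdir -p "$dst"
    ( cd "$src" && find . -type f ) |
    while IFS= read -r rel; do
        ino=$(ls -i "$src/$rel" | awk '{ print $1; exit }')
        mkdir -p "$dst/$(dirname "$rel")"
        # Look the inode up in the catalog (compared as strings).
        first=$(awk -F '\t' -v ino="$ino" \
            '($1 "") == (ino "") { print substr($0, length($1) + 2); exit }' "$seen")
        if [ -n "$first" ]; then
            ln "$dst/$first" "$dst/$rel"        # already copied: link to it
        else
            cp -p "$src/$rel" "$dst/$rel"       # first sighting: copy it
            printf '%s\t%s\n' "$ino" "$rel" >> "$seen"
        fi
    done
    rm -f "$seen"
}
```

A tree that spans several filesystems would need the device number in the catalog key as well, as the post describes.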
s***@isnotyourbuddy.co.uk
2020-12-01 08:21:02 UTC
On Mon, 30 Nov 2020 16:21:33 -0800
Post by Siri Cruise
Tcl file copy works so I've written a command to do the recursive
Tcl? There's a blast from the past. Does anyone still use that language for
anything but legacy code?
Siri Cruise
2020-12-01 09:31:02 UTC
Post by s***@isnotyourbuddy.co.uk
On Mon, 30 Nov 2020 16:21:33 -0800
Post by Siri Cruise
Tcl file copy works so I've written a command to do the recursive
Tcl? There's a blast from the past. Does anyone still use that language for
anything but legacy code?
Because it's much more fun to rewrite all your scripts every few
years when the kids find a hot new toy.
s***@isnotyourbuddy.co.uk
2020-12-01 11:06:37 UTC
On Tue, 01 Dec 2020 01:31:02 -0800
Post by Siri Cruise
Post by s***@isnotyourbuddy.co.uk
On Mon, 30 Nov 2020 16:21:33 -0800
Post by Siri Cruise
Tcl file copy works so I've written a command to do the recursive
Tcl? There's a blast from the past. Does anyone still use that language for
anything but legacy code?
Because it's much more fun to rewrite all your scripts every few
years when the kids find a hot new toy.
That's why I said legacy. I did a lot of Tcl and Tk in the 90s. An utterly
horrid language with a hopelessly inconsistent and unclear syntax that
couldn't decide if it was shell script, C or Basic.
Ralf Fassel
2020-12-01 10:51:01 UTC
* ***@isnotyourbuddy.co.uk
| On Mon, 30 Nov 2020 16:21:33 -0800
| Siri Cruise <***@yahoo.com> wrote:
| >Tcl file copy works so I've written a command to do the recursive
| Tcl? There's a blast from the past. Does anyone still use that
| language for anything but legacy code?

Yes.

R'
Ralf Fassel
2020-12-01 10:55:09 UTC
* Siri Cruise <***@yahoo.com>
| In article <slrnrsas6j.2ptb.grahn+***@frailea.sa.invalid>,
| Jorgen Grahn <grahn+***@snipabacken.se> wrote:
| > On Sun, 2020-11-29, Siri Cruise wrote:
| > > I'm doing
| > > (cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)

Note that if the first cd fails, this will pick up the wrong files.
Error messages are easy to overlook.

Better
(cd dir1 && tar -c -f - *SO ...
Or even
tar -C dir1 -c -f - ...

| > No, but I think I'd use strace (or your OS's equivalent) to find out
| > what it's actually doing. Is there a point where it writes megabytes
| > of \0 to fd 0? Is there a point where it reads megabytes of \0? and
| > so on.
| >
| > /Jorgen
| Tcl file copy works so I've written a command to do the recursive
| copy. I can't use cp -R since it separates hard links. So the
| command catalogs dev and ino to link outputs where inputs are
| linked.

Maybe try rsync?

R'
Siri Cruise
2020-12-01 12:21:55 UTC
Post by Ralf Fassel
Maybe try rsync?
I got something that works now. I'll have to leave this as
another unexplained Apple screw up.
Richard Kettlewell
2020-12-01 09:34:04 UTC
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Do any of the filenames matched by *SO* start with a dash?

I don’t know of any tar options that would produce the behavior you see,
but we don’t have the exact command executed yet.
--
https://www.greenend.org.uk/rjk/
Siri Cruise
2020-12-01 09:36:19 UTC
Post by Richard Kettlewell
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Do any of the filenames matched by *SO* start with a dash?
They're all lib*.so shared libraries.
Keith Thompson
2020-12-01 10:29:14 UTC
Post by Siri Cruise
Post by Richard Kettlewell
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Do any of the filenames matched by *SO* start with a dash?
They're all lib*.so shared libraries.
*.so shared libraries almost always have names containing ".so", not
".SO". The point is that, assuming they don't actually have "SO" in
their names, you haven't shown us the actual command, and we can't guess
what other details you've omitted.

Why not "lib*.so*"? "*so*" could easily match unrelated files, like
"foo.json" or "sort.txt".

Please show us the exact command and a listing of the matching files.
If possible, narrow it down to a small set of files that reproduce the
problem.
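
To see exactly which names a pattern picks up before handing it to tar, the glob can be expanded on its own. A minimal sketch, with list_matches as a made-up helper name:

```shell
# Hypothetical helper: print the names in directory $1 that match glob $2.
# $2 is deliberately left unquoted in the for loop so the shell expands it.
list_matches() {
    (
        cd "$1" || exit 1
        for f in $2; do
            # An unmatched glob is left as literal text, so check existence.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    )
}
```

In a directory holding libfoo.so, foo.json and sort.txt, list_matches . '*so*' lists all three, while list_matches . 'lib*.so*' lists only libfoo.so. The pattern has to be quoted at the call site so that it is expanded inside the target directory and not in the current one.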
Siri Cruise
2020-12-01 12:24:45 UTC
Post by Keith Thompson
Post by Siri Cruise
Post by Richard Kettlewell
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Do any of the filenames matched by *SO* start with a dash?
They're all lib*.so shared libraries.
*.so shared libraries almost always have names containing ".so", not
".SO". The point is that, assuming they don't actually have "SO" in
their names, you haven't shown us the actual command, and we can't guess
what other details you've omitted.
Why not "lib*.so*"? "*so*" could easily match unrelated files, like
"foo.json" or "sort.txt".
Please show us the exact command and a listing of the matching files.
If possible, narrow it down to a small set of files that reproduce the
problem.
I did tar cf | tar xf for one of the specific files that got
zonked. I'm going to eventually replace these piped tars with my
own script where I can control every detail.
Keith Thompson
2020-12-01 18:12:20 UTC
Post by Siri Cruise
Post by Keith Thompson
Post by Siri Cruise
Post by Richard Kettlewell
Post by Siri Cruise
I'm doing
(cd dir1; tar -c -f - *SO* | tar -x -C dir2 -f -)
and for unknown reason it zeroes out the first 040000000 bytes of
some files. Any clues?
Do any of the filenames matched by *SO* start with a dash?
They're all lib*.so shared libraries.
*.so shared libraries almost always have names containing ".so", not
".SO". The point is that, assuming they don't actually have "SO" in
their names, you haven't shown us the actual command, and we can't guess
what other details you've omitted.
Why not "lib*.so*"? "*so*" could easily match unrelated files, like
"foo.json" or "sort.txt".
Please show us the exact command and a listing of the matching files.
If possible, narrow it down to a small set of files that reproduce the
problem.
I did tar cf | tar xf for one of the specific files that got
zonked. I'm going to eventually replace these piped tars with my
own script where I can control every detail.
So you're not going to give us enough information to diagnose the
problem.

There could be a problem with tar on your system, and others could run
into it. Please consider constructing a reproducible test case.
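
A reproducible test case along these lines might look like the sketch below. It is only a guess at a useful starting point, not a known reproduction: it builds one file of random data, copies it through the same kind of tar pipe, and compares the result. The function name, the file name libtest.so and the default size of 9 MiB (chosen to reach just past the 8 MiB mark mentioned in the thread) are all assumptions.

```shell
# Hypothetical reproduction attempt: copy one random file through a tar pipe
# and report whether the copy matches. $1 is the file size in KiB.
tar_pipe_roundtrip() {
    size_kib=${1:-9216}             # default 9 MiB, just past 8 MiB
    work=$(mktemp -d) || return 1
    mkdir "$work/dir1" "$work/dir2"
    dd if=/dev/urandom of="$work/dir1/libtest.so" bs=1024 count="$size_kib" 2>/dev/null
    ( cd "$work/dir1" && tar -c -f - libtest.so ) | tar -x -C "$work/dir2" -f -
    if cmp -s "$work/dir1/libtest.so" "$work/dir2/libtest.so"; then
        echo identical
    else
        echo different
    fi
    rm -rf "$work"
}
```

If this prints "different" on the affected machine, the script together with the output of tar --version would be enough for others to investigate. If it prints "identical", the problem presumably depends on something about the original files or volume that the sketch does not capture.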