For a project we need to export some shapes as a .dxf file using the PHP DXFwriter (https://github.com/digitalfotografen/DXFwriter), which sadly does not include ellipses. Until now we have used polylines instead, but with hundreds of single points that was not good for our purpose.
We now want to use the ellipse entity, but if we just add an ellipse to our entities section, AutoCAD is not able to open the .dxf file. Do we have to add some lines to one of the other sections to get ellipses to work, or do you have some other ideas how to solve this problem?
The entities section:
0
SECTION
2
ENTITIES
0
ELLIPSE
5
262
330
1F
100
AcDbEntity
8
0
100
AcDbEllipse
10
1927.933413526791
20
2355.552659681358
30
0.0
11
1694.611795869434
21
-112.6281645577583
31
0.0
210
0.0
220
0.0
230
1.0
40
0.2345744769758316
41
0.0
42
6.283185307179586
0
ENDSEC
Greetings
Joe
Solution:
In the end we decided to write our own DXF export library, which is able to export valid R13 DXF files. It's open source, so if anyone has similar problems, maybe https://github.com/enjoping/DXFighter is something for you.
A DXF file with only an ENTITIES section is considered by AutoCAD to be an R12 format file and cannot contain entity types added after that release, like ELLIPSE and LWPOLYLINE. You cannot omit the other sections, because for R13 and newer files there is an audit step which checks that the file is valid.
From my experiments, it seems to be very difficult to build a valid post-R12 DXF file.
Related
Firstly, my Java version:
String str = "helloworld";
ByteArrayOutputStream localByteArrayOutputStream = new ByteArrayOutputStream(str.length());
GZIPOutputStream localGZIPOutputStream = new GZIPOutputStream(localByteArrayOutputStream);
localGZIPOutputStream.write(str.getBytes("UTF-8"));
localGZIPOutputStream.close();
localByteArrayOutputStream.close();
for(int i = 0;i < localByteArrayOutputStream.toByteArray().length;i ++){
System.out.println(localByteArrayOutputStream.toByteArray()[i]);
}
and output is:
31
-117
8
0
0
0
0
0
0
0
-53
72
-51
-55
-55
47
-49
47
-54
73
1
0
-83
32
-21
-7
10
0
0
0
Then the Go version:
var gzBf bytes.Buffer
gzSizeBf := bufio.NewWriterSize(&gzBf, len(str))
gz := gzip.NewWriter(gzSizeBf)
gz.Write([]byte(str))
gz.Flush()
gz.Close()
gzSizeBf.Flush()
GB := (&gzBf).Bytes()
for i := 0; i < len(GB); i++ {
fmt.Println(GB[i])
}
output:
31
139
8
0
0
9
110
136
0
255
202
72
205
201
201
47
207
47
202
73
1
0
0
0
255
255
1
0
0
255
255
173
32
235
249
10
0
0
0
Why?
At first I thought it might be caused by the two languages' different byte-reading methods. But I noticed that 0 can never convert to 9. And the sizes of the []byte slices are different.
Have I written wrong code? Is there any way to make my Go program get the same output as the Java program?
Thanks!
First thing is that the byte type in Java is signed, it has a range of -128..127, while in Go byte is an alias of uint8 and has a range of 0..255. So if you want to compare the results, you have to shift negative Java values by 256 (add 256).
Tip: To display a Java byte value in an unsigned fashion, use: byteValue & 0xff which converts it to int using the 8 bits of the byte as the lowest 8 bits in the int. Or better: display both results in hex form so you don't have to care about sign-ness...
Even if you do the shift, you will still see different results. That might be due to different default compression levels in the two languages. Note that although the default compression level is 6 in both Java and Go, this is not specified; different implementations are allowed to choose different values, and it might also change in future releases.
And even if the compression level were the same, you might still encounter differences, because gzip is based on LZ77 and Huffman coding, which uses a tree built on frequency (probability) to decide the output codes. If different input characters or bit patterns have the same frequency, the assigned codes might vary between them; moreover, multiple output bit patterns might have the same length, and therefore a different one might be chosen.
If you want the same output, the only way (but see the notes below!) is to use compression level 0 (not to compress at all). In Go use the compression level gzip.NoCompression and in Java use Deflater.NO_COMPRESSION.
Java:
GZIPOutputStream gzip = new GZIPOutputStream(localByteArrayOutputStream) {
{
def.setLevel(Deflater.NO_COMPRESSION);
}
};
Go:
gz, err := gzip.NewWriterLevel(gzSizeBf, gzip.NoCompression)
But I wouldn't worry about the different outputs. Gzip is a standard, even if outputs are not the same, you will still be able to decompress the output with any gzip decoders whichever was used to compress the data, and the decoded data will be exactly the same.
Here are the simplified, extended versions:
Not that it matters, but your codes are unneccessarily complex. You could simplify them like this (these versions also include setting 0 compression level and converting negative Java byte values):
Java version:
ByteArrayOutputStream buf = new ByteArrayOutputStream();
GZIPOutputStream gz = new GZIPOutputStream(buf) {
{ def.setLevel(Deflater.NO_COMPRESSION); }
};
gz.write("helloworld".getBytes("UTF-8"));
gz.close();
for (byte b : buf.toByteArray())
System.out.print((b & 0xff) + " ");
Go version:
var buf bytes.Buffer
gz, _ := gzip.NewWriterLevel(&buf, gzip.NoCompression)
gz.Write([]byte("helloworld"))
gz.Close()
fmt.Println(buf.Bytes())
NOTES:
The gzip format allows some extra fields (headers) to be included in the output.
In Go these are represented by the gzip.Header type:
type Header struct {
Comment string // comment
Extra []byte // "extra data"
ModTime time.Time // modification time
Name string // file name
OS byte // operating system type
}
And it is accessible via the Writer.Header struct field. Go sets and inserts these fields, while Java does not (it leaves the header fields zero). So even if you set the compression level to 0 in both languages, the outputs will not be the same (but the "compressed" data will match in both outputs).
Unfortunately the standard Java does not provide a way/interface to set/add these fields, and Go does not make it optional to fill the Header fields in the output, so you will not be able to generate exact outputs.
An option would be to use a 3rd party GZip library for Java which supports setting these fields. Apache Commons Compress is such an example: it contains a GzipCompressorOutputStream class which has a constructor that allows a GzipParameters instance to be passed. This GzipParameters is the equivalent of the gzip.Header structure. Only using this would you be able to generate exact output.
But as mentioned, generating exact output has no real-life value.
From RFC 1952, the GZip file header is structured as:
+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG| MTIME |XFL|OS | (more-->)
+---+---+---+---+---+---+---+---+---+---+
Looking at the output you've provided, we have:
| Java | Go
ID1 | 31 | 31
ID2 | 139 | 139
CM (compression method) | 8 | 8
FLG (flags) | 0 | 0
MTIME (modification time) | 0 0 0 0 | 0 9 110 136
XFL (extra flags) | 0 | 0
OS (operating system) | 0 | 255
So we can see that Go is setting the modification time field of the header, and setting the operating system to 255 (unknown) rather than 0 (FAT file system). In other respects they indicate that the file is compressed in the same way.
In general these sorts of differences are harmless. If you want to determine if two compressed files are the same, then you should really compare the decompressed versions of the files though.
I have an incoming stream that is compressed using the zlib functions, but I cannot tell where the compressed data ends, so I am having a lot of trouble getting the data out.
I also have a snippet of the source code where it is being uncompressed in AS3 flash, which should have been enough for me to figure it out, but I am at a loss.
Two files included:
http://falazar.com/projects/irnfll/version4/test//stackoverflow/as3_code.as.txt
http://falazar.com/projects/irnfll/version4/test//stackoverflow/bin_data_file
Snippet of the binary data, and what I know:
00 00 02 34 2c 02 31 78 5e ed dc cd 6e da 40 a5
21 19 40 f5 f2 c4 b7 e9 18 85 e1 5b 89 66 3d 42
31 95 90 cd 15 74 99 55 37 51 14 59 c9 a8 8c 54
0234 appears to be a size marker - 564
2c = 44, the code to match as3 COMMAND_WORLD_DATA, that is ok
0231 is another size marker, always 3 smaller than the one above
785e is the header marker for zlib's weakest compression, ZLIB_ENCODING_DEFLATE level 4
Later there is also a 789c which is another larger compressed block
I need to uncompress the two of these to move forward in the project. Thank you for your help.
There is also mention in the script of big-endian conversion, and I am not sure if I need to handle that.
I have written a couple of scripts to try to solve this, including a PHP snippet that chops bytes off the end in a loop and tries to uncompress after each cut, with no luck.
falazar.com/projects/irnfll/version4/test//stackoverflow/php_test.php.txt
Ideal solution in php or c#, but anything I can see that works will translate into another language easy enough.
(Using Free hex editor nero to view the binary)
You mean zlib.
Use PHP's gzuncompress() starting at each zlib header (e.g. 789c).
I need to scan an uploaded PDF to determine if the pages within are all portrait or if there are any landscape pages. Is there someway I can use PHP or a linux command to scan the PDF for these pages?
(Updated answer -- scroll down...)
You can use either pdfinfo (part of either the poppler-utils or the xpdf-tools) or identify (part of the ImageMagick toolkit).
identify:
identify -format "%f Page %s: Width: %W -- Height: %H\n" T-VD7.PDF
Example output:
T-VD7.PDF Page 0: Width: 595 -- Height: 842
T-VD7.PDF Page 1: Width: 595 -- Height: 842
T-VD7.PDF Page 2: Width: 1191 -- Height: 842
[...]
T-VD7.PDF Page 11: Width: 595 -- Height: 421
T-VD7.PDF Page 12: Width: 595 -- Height: 842
Or a bit simpler:
identify -format "%s: %Wx%H\n" T-VD7.PDF
gives:
0: 595x842
1: 595x842
2: 1191x842
[...]
11: 595x421
12: 595x842
Note how identify uses a zero-based page counting mechanism!
Pages are 'landscape' if their width is bigger than their height. They are neither-nor, if both are equal.
The advantage is that identify lets you tweak the output format quite easily and very extensively.
pdfinfo:
pdfinfo input.pdf | grep "Page.*size:"
Example output:
Page size: 595.276 x 841.89 pts (A4)
pdfinfo is definitely faster and also more precise than identify when it comes to multi-page PDFs. (The 13-page PDF I tested this with took identify 31 seconds to process, whereas pdfinfo needed less than half a second...)
Be warned: by default pdfinfo reports the size of the first page only. To get sizes for all pages (as you may know, there are PDFs which use mixed page sizes as well as mixed orientations), you have to modify the command:
pdfinfo -f 1 -l 13 input.pdf | grep "Page.*size:"
Output now:
Page 1 size: 595.276 x 841.89 pts (A4)
Page 2 size: 595.276 x 841.89 pts (A4)
Page 3 size: 1191 x 842 pts (A3)
[....]
Page 12 size: 595 x 421 pts (A5)
Page 13 size: 595.276 x 841.89 pts (A4)
This will print the sizes of page 1 (-f, the first page to report) through page 13 (-l, the last page to report).
Scripting it:
pdfinfo \
  -f 1 \
  -l 1000 \
  Vergleich-VD7.PDF \
| grep "Page.*size:" \
| while read Page _pageno size _width x _height rest; do
    [ "$(echo "${_width} / 1"|bc)" -gt "$(echo "${_height} / 1"|bc)" ] \
      && echo "Page $_pageno is landscape..." \
      || echo "Page $_pageno is portrait..." ; \
  done
(The bc trick is required because the shell's -gt comparison works only with integers. Dividing by 1 with bc truncates the possible real values to integers...)
Result:
Page 1 is portrait...
Page 2 is portrait...
Page 3 is landscape...
[...]
Page 12 is landscape...
Page 13 is portrait...
Update: Using the 'right' pdfinfo to discover page rotations...
My initial answer tooted the horn of pdfinfo. Serenade X says in a comment that his/her problem is discovering rotated pages.
OK, here is some additional info which is not yet widely known and therefore has not yet really been absorbed by all pdfinfo users...
As I mentioned, there are two different pdfinfo utilities around:
the one which comes as part of the xpdf-utils package (on some platforms also named xpdf-tools).
the one which comes as part of the poppler-utils package (on some platforms also named poppler-tools; sometimes it is not separated out as a package but is part of the main poppler package).
Poppler's pdfinfo output
So here is a sample output from Poppler's pdfinfo command. The tested file is a 2-page PDF where the first page is in portrait A4 and the second page is in landscape A4 format:
kp#mbp:~$ pdfinfo -f 1 -l 2 a4portrait+landscape.pdf
Producer: GPL Ghostscript 9.05
CreationDate: Thu Jul 26 14:23:31 2012
ModDate: Thu Jul 26 14:23:31 2012
Tagged: no
Form: none
Pages: 2
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 1 rot: 0
Page 2 size: 842 x 595 pts (A4)
Page 2 rot: 0
File size: 3100 bytes
Optimized: no
PDF version: 1.4
Do you see the lines saying Page 1 rot: 0 and Page 2 rot: 0?
Do you notice the lines saying Page 1 size: 595 x 842 pts (A4) and Page 2 size: 842 x 595 pts (A4) and the differences between the two?
XPDF's pdfinfo output
Now let's compare this to the output of XPDF's pdfinfo:
kp#mbp:~$ xpdf-pdfinfo -f 1 -l 2 a4portrait+landscape.pdf
Producer: GPL Ghostscript 9.05
CreationDate: Thu Jul 26 14:23:31 2012
ModDate: Thu Jul 26 14:23:31 2012
Tagged: no
Pages: 2
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 2 size: 842 x 595 pts (A4)
File size: 3100 bytes
Optimized: no
PDF version: 1.4
You may notice one more difference if you look closely enough. I won't point my finger at it, and will keep my mouth shut for now... :-)
Poppler's pdfinfo correctly reports rotation of page 2
Next, I rotate the second page of the file by 90 degrees using pdftk (I don't have Adobe Acrobat around):
pdftk \
a4portrait+landscape.pdf \
cat 1 2E \
output a4portrait+landscape---page2-landscaped-by-pdftk.pdf
Now Poppler's pdfinfo reports this:
kp#mbp:~$ pdfinfo -f 1 -l 2 a4portrait+landscape---page2-landscaped-by-pdftk.pdf
Creator: pdftk 1.44 - www.pdftk.com
Producer: itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate: Thu Jul 26 14:39:47 2012
ModDate: Thu Jul 26 14:39:47 2012
Tagged: no
Form: none
Pages: 2
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 1 rot: 0
Page 2 size: 842 x 595 pts (A4)
Page 2 rot: 90
File size: 1759 bytes
Optimized: no
PDF version: 1.4
As you can see, the line Page 2 rot: 90 tells us what we are looking for. XPDF's pdfinfo would essentially report the same info about the changed file as it does about the original one. Of course, it would still correctly capture the changed Creator:, Producer: and *Date: infos, but it would miss the rotated page...
Also note this detail: page 2 originally was designed as a landscape page, which can be seen from the Page 2 size: 842 x 595 pts (A4) info part. However, it shows up in the current PDF as a portrait page, as can be seen by the Page 2 rot: 90 part.
Also note that there are 4 different values that could appear for the rotation info:
0 (no rotation),
90 (rotation to the East, or 90 degrees clockwise),
180 (rotation to the South, tumbled page image, upside-down, or 180 degrees clockwise),
270 (rotation to the West, or 90 degrees counter-clockwise, or 270 degrees clockwise).
Some Background Info
Poppler (developed by The Poppler Developers) is a fork of XPDF (developed by Glyph & Cog LLC) that happened around 2005. (As one of the important reasons for the fork, the Poppler developers at the time gave: Glyph & Cog didn't always provide timely bugfixes for security-related problems...)
Anyway, the Poppler fork for a very long time kept the associated commandline utilities, their commandline parameters and syntax as well as the format of their output compatible to the original (XPDF/Glyph & Cog LLC) ones.
Existing Poppler tools gaining additional features over competing XPDF tools
However, more recently they started to add additional features. Off the top of my head:
pdfinfo now also reports the rotation status of each page (starting with Poppler v0.19.0, released March 1st, 2012).
pdffonts now also reports the font encoding for each font (starting with Poppler v0.19.1, released March 15th, 2012).
Poppler tools getting more siblings
The Poppler tools also provide some extra commandline utilities which are not in the original XPDF package (some of which have been added only quite recently):
pdftocairo - utility for creating PNG, JPEG, PostScript, EPS, PDF, SVG (using Cairo)
pdfseparate - utility to extract PDF pages
pdfunite - utility to merge PDF files
pdfdetach - utility to list or extract embedded files from a PDF
pdftohtml - utility to convert PDF files to HTML
identify, which comes with ImageMagick, will give you the width and height of a given PDF file (it also requires that Ghostscript be installed on the system).
$ identify -format "%g\n" FILENAME.PDF
1417x1106+0+0
Where 1417 is the width, 1106 is the height, and you (for this purpose) can ignore the +0+0.
Edit: Sorry, I was referring to Mike B's comment on the original question - as he said, after knowing the width and height you can determine if you have a portrait or landscape image (if height > width then portrait else landscape).
Also, the \n added to the -format argument (as suggested by Kurt Pfeifle) will separate each page into its own line. He also mentions the %W and %H format parameters; all the possible format parameters can be found here (there are a lot of them).
I am trying to convert a PDF to a JPG with a PHP exec() call, which looks like this:
convert page.pdf -resize 716x716 page.jpg
For some reason, the JPG comes out with janky text, despite the PDF looking just fine in Acrobat and Mac Preview. Here is the original PDF:
http://whit.info/dev/conversion/page.pdf
and here is the janktastic output:
http://whit.info/dev/conversion/page.jpg
The server is a LAMP stack with PHP 5 and ImageMagick 6.2.8.
Can you help this stumped Geek?
Thanks in advance,
Whit
ImageMagick is just going to call out to Ghostscript to convert this PDF to an image. If you run gs on the pdf, you get the same badly-spaced output.
I suspect Ghostscript isn't handling the PDF's embedded TrueType fonts very well. If you could change your output to either embed Type 1 fonts or use a "core" PostScript font, you'd get better results.
I suspect it's an encoding/widths issue. Both are a tad off, though I can't put my finger on why.
Here are some suspects:
First
The text stream is encoded in UTF-16 LE (so each ASCII character is followed by an inline NUL byte: char NUL char NUL ...), yet it uses the normal string drawing command syntax:
(some text) Tj
There's a way to escape any old character value into a () string. You can also define strings in hex thusly:
<203245> Tj
Neither method is used, just the questionable inline nulls. That could cause an issue in GS if it's trying to work with pointers to char without lengths associated with them.
Second
The widths array is dumb. You can define widths in groups thusly:
[ 32 [450 525 500] 37 [600 250] 40 [0] ]
This defines
32: 450
33: 525
34: 500
37: 600
38: 250
40: 0
These fonts define their consecutive widths in individual arrays. Not illegal, but definitely wasteful/stupid, and if GS were coded to EXPECT gaps between the arrays, it could induce a bug.
There are also some extremely fishy values in the array. 32 through 126 are defined consecutively, but then it starts jumping all over: ...126 [600] 8364 [500] 8216 [222] 402 [500] 8222 [389] 8230 [1000] 8224 [444]... and then it goes back to being consecutive from 160 to 255.
Just weird.
Third
I'm not even remotely sure, but the CIDToGIDMap stream contains an AWFUL lot of nulls.
Bottom line
Those fonts are fishy. And I've never heard of "Bellflower Books" or "UFPDF 0.1"
That version number makes me cringe. It should make you cringe too.
Googling for "UFPDF" I found this note from the author:
Note: I wrote UFPDF as an experiment, not as a finished product. If you have problems using it, don't bug me for support. Patches are welcome though, but I don't have much time to maintain this.
UFPDF is a PHP library that sits on top of FPDF. Version 0.1. Just run away.
Apologies in advance if I'm misusing terminology; corrections are appreciated. I'm fascinated by directed graphs, but I never had the math/CS background to know what they're really about; I just like the tech because it makes useful diagrams.
I'm trying to create a web application feature that will render a dynamic directed graph in the browser. I recently discovered Canviz, which is a canvas-based xdot renderer that I'd like to use.
Canviz is awesome, but it renders xdot files, which (appear to?) contain all the complicated positioning logic:
/* example xdot file */
digraph abstract {
graph [size="6,6"];
node [label="\N"];
graph [bb="0,0,1250,612",
_draw_="c 9 -#ffffffff C 9 -#ffffffff P 4 0 -1 0 612 1251 612 1251 -1 ",
xdotversion="1.2"];
S1 [pos="464,594", width="0.75", height="0.5", _draw_="c 9 -#000000ff e 464 594 27 18 ", _ldraw_="F 14.000000 11 -Times-Roman c 9 -#000000ff T 464 588 0 15 2 -S1 "];
10 [pos="409,522", width="0.75", height="0.5", _draw_="c 9 -#000000ff e 409 522 27 18 ", _ldraw_="F 14.000000 11 -Times-Roman c 9 -#000000ff T 409 516 0 15 2 -10 "];
S1 -> 10 [pos="e,421.43,538.27 451.52,577.66 444.49,568.46 435.57,556.78 427.71,546.5", _draw_="c 9 -#000000ff B 4 452 578 444 568 436 557 428 546 ", _hdraw_="S 5 -solid c 9 -#000000ff C 9 -#000000ff P 3 430 544 421 538 425 548 "];
}
The files I'm generating with my application are dot files, which contain none of this positioning logic:
digraph g {
ranksep=6
node [
fontsize = "16"
shape = "rectangle"
width =3
height =.5
];
edge [
];
S1 -> 10
}
I'm looking for a PHP library that can convert my dot file into an xdot file that can be consumed by Canviz. I realize that the command line program dot can do this, but this is for a redistributable PHP web application, and I'd prefer to avoid any binaries as dependencies.
My core problem: I'm generating dot files based on simple directed relationships, and I want to display the visual graph to end users in a browser. I'd like to do this without having to rely on the presence of a particular binary program on the server. I think the best solution for this is Canviz+PHP to generate xdot files. I'm looking for a PHP library that can do this. However, I'm more than open to other solutions.
Have you looked at Image_GraphViz? It's really just a wrapper for the binary, but from the look of things, I don't think you'll find anything better, and this at least keeps you from having to make direct command-line calls from your PHP script.
$dot_obj = new Image_GraphViz();
$dot_obj->load('path/to/graph.gv');
$xdot = $dot_obj->fetch('xdot');