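Handling C Structs in Lisp

Overview

Ultimately, handling a struct produced by a C or C++ program comes down to loading the structure into a buffer and pulling out its individual fields. Likewise, sending such a struct means composing a buffer.

When the Lisp side works with strings or characters, some type conversion is required.

The preferred approach here is to use '(unsigned-byte 8) for the bytes coming from or going to the C struct.

Character arrays may be cheap in other languages such as C, or may let you exploit de facto properties of the ASCII character set, but for ANSI Common Lisp this is the wrong approach.

The ANSI specification does not guarantee that characters are encoded as ASCII, ISO-8859-1, or any similar character set. In fact, it guarantees only that char-code-limit is at least 96 [1], although contemporary implementations generally support Unicode.

As with the advice for files and directories, simply read or write a vector of unsigned bytes.

Any other part of the Lisp program that needs actual strings or characters should convert as needed after reading.

Reading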

(defun read-c-file (&optional (file-path "data.struct") (max-length 48))
  (with-open-file (stream (merge-pathnames file-path)
                          :element-type '(unsigned-byte 8)
                          :direction :input)
    ;; Allocate room for MAX-LENGTH bytes; the fill pointer starts at the end.
    (let ((buffer (make-array max-length
                              :element-type '(unsigned-byte 8)
                              :fill-pointer t)))
      ;; READ-SEQUENCE returns the index of the first element it did not fill.
      (let ((actual-length (read-sequence buffer stream
                                          :end max-length)))
        ;; Shrink the active length to what was actually read.
        (setf (fill-pointer buffer) actual-length)
        (format t "received=~a max=~a buffer=~s~%" actual-length max-length buffer))
      buffer)))

Using the array's fill-pointer is optional but recommended: it tracks the length actually received, which may differ from the length you attempted to read.
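To make the effect of the fill pointer concrete, here is a minimal sketch that needs no file. The function name short-read-demo, the 8-byte capacity, and the 3-byte "read" are made up for illustration.

```lisp
;; Sketch only: simulates a short read to show what the fill pointer records.
(defun short-read-demo ()
  (let ((buffer (make-array 8
                            :element-type '(unsigned-byte 8)
                            :initial-element 0
                            :fill-pointer t)))
    ;; Pretend READ-SEQUENCE filled only the first three elements.
    (setf (fill-pointer buffer) 3)
    (list (length buffer)               ; active length, honors the fill pointer
          (array-dimension buffer 0)))) ; allocated capacity, ignores it

(short-read-demo)   ; => (3 8)
```

Sequence functions such as length, subseq, and map see only the active part, so later processing does not pick up stale bytes beyond what was read.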

Writing

(defun write-C-file (buffer &optional length (file-path "data.struct"))
  ;; Default to the whole buffer (its active length if it has a fill pointer).
  (unless length
    (setf length (length buffer)))
  (with-open-file (stream (merge-pathnames file-path)
                          :element-type '(unsigned-byte 8)
                          :direction :output
                          :if-exists :rename)
    ;; WRITE-SEQUENCE returns the sequence itself, not a byte count,
    ;; so LENGTH is passed as :end and reported directly.
    (write-sequence buffer stream :end length)
    (format t "wrote=~a bytes buffer=~s~%" length buffer))
  buffer)
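A quick way to check that nothing is altered on the way to and from disk is a round trip. The following is a self-contained sketch, not part of the functions above: the name round-trip-bytes and the file name round-trip-demo.bin are hypothetical, and it uses :if-exists :supersede so that repeated runs do not leave renamed backups behind.

```lisp
;; Sketch only: write a byte vector to a file, then read it back.
(defun round-trip-bytes (bytes &optional (path "round-trip-demo.bin"))
  (with-open-file (out path
                       :element-type '(unsigned-byte 8)
                       :direction :output
                       :if-exists :supersede)
    (write-sequence bytes out))
  (with-open-file (in path :element-type '(unsigned-byte 8))
    ;; FILE-LENGTH on a binary stream is measured in stream elements (bytes here).
    (let ((buffer (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence buffer in)
      buffer)))

;; EQUALP compares element by element, regardless of array element type.
(equalp (round-trip-bytes #(1 2 255)) #(1 2 255))
```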

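Processing

When processing the vector of '(unsigned-byte 8) elements, convert each field of the corresponding C struct as needed, locating it by its byte offset. (Note that it really is a vector even though it was created with make-array: a vector is simply an array with exactly one dimension.)

Extracting a string from the raw bytes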

(map 'string #'code-char
     (subseq buffer *start-index* *end-index*))
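Fixed-width char fields in a C struct are often padded with NUL bytes after the text. The sketch below stops at the first NUL inside the field. The helper name field-to-string is hypothetical, and the code assumes that code-char maps byte values to the expected ASCII characters, which holds on contemporary implementations but is not guaranteed by the standard.

```lisp
;; Sketch only: decode a NUL-padded, fixed-width character field.
(defun field-to-string (buffer start end)
  "Decode BUFFER[START, END) as a string, stopping at the first NUL byte."
  (let ((nul (position 0 buffer :start start :end end)))
    (map 'string #'code-char
         (subseq buffer start (or nul end)))))

;; An 8-byte field holding "LISP" plus padding, followed by unrelated bytes.
(field-to-string #(76 73 83 80 0 0 0 0 1 2) 0 8)
```

On an ASCII-compatible implementation the call returns "LISP".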

Extracting just one byte

(subseq buffer *state-index* (1+ *state-index*))

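Of course, you would bind or assign the value returned by each of the two examples above. Note that subseq returns a one-element vector; to get the byte itself as an integer, use (elt buffer *state-index*).

Conversely, assigning a value into the raw byte buffer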

(setf (elt buffer *magic-number-index*) (logand #xFF *preamble-value*))

It is important to guard what is being assigned; a bit mask applied with logand works well here.
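As a small illustration of what the mask does, the sketch below wraps it in a helper (the name clamp-to-octet is hypothetical). Values that already fit in a byte pass through unchanged, while larger or negative integers are reduced to their low 8 bits.

```lisp
;; Sketch only: keep the low 8 bits so the result always fits (unsigned-byte 8).
(defun clamp-to-octet (value)
  "Return the low 8 bits of the integer VALUE."
  (logand #xFF value))

(clamp-to-octet 65)    ; => 65
(clamp-to-octet 300)   ; => 44   (300 = 256 + 44)
(clamp-to-octet -1)    ; => 255  (two's-complement view of -1)
```

The mask truncates silently, so if an out-of-range value indicates a bug in the caller, an explicit check before storing may be the better choice.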

For more than one byte, such as inserting sequence2 into a subsequence of sequence1, use

(replace sequence1 (map 'vector #'char-code sequence2)
	 :start1 a :end1 b)

Or, following the examples above

(replace buffer (map '(vector (unsigned-byte 8)) #'char-code string-text)
         :start1 *start-index* :end1 *end-index*)
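Here is the same idea with concrete values, as a sketch. The function name write-name-field, the 8-byte buffer, and the offsets 2 and 6 are made up for illustration. It assumes char-code yields ASCII codes for these characters, which is true on contemporary implementations but not guaranteed by the standard.

```lisp
;; Sketch only: store the text "LISP" into bytes 2..5 of a zeroed buffer.
(defun write-name-field ()
  (let ((buffer (make-array 8
                            :element-type '(unsigned-byte 8)
                            :initial-element 0)))
    (replace buffer
             (map '(vector (unsigned-byte 8)) #'char-code "LISP")
             :start1 2 :end1 6)
    buffer))

(write-name-field)   ; => #(0 0 76 73 83 80 0 0) on ASCII-compatible implementations
```

Because :end1 bounds the destination, a string longer than the field is cut off rather than overwriting the neighboring fields.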

A helper function is provided below that avoids creating the intermediate vector.

Helper functions

(defun map-replace (fn sequence1 sequence2 &key (start1 0) end1 (start2 0) end2)
  "Alter elements of first sequence with those from second but after applying function
to that element first, performing each element in order.

Results will be identical to the following but without creating
intermediate vector:
  (replace sequence1 (map 'vector #'char-code sequence2) :start1 start1 :end1 end1)

See also: http://common-lisp.net/project/trivial-utf-8

Side-effects: sequence1 gets modified unless sequence2 is effectively nil.
Returns sequence1 after all modifications.
"
  (loop
     for i upfrom start1 below (or end1 (length sequence1))
     and j upfrom start2 below (or end2 (length sequence2))
     do (setf (elt sequence1 i) (funcall fn (elt sequence2 j))))
  sequence1)

(defun network-bytes-to-number (buffer start-index total-bits)
  "Convert network byte ordered sequence of unsigned bytes to a number."
  (unless (= (mod total-bits 8) 0)
    (error "Please specify total-bits as total for multiples of eight bit bytes"))
  (let ((value 0))
    (loop for i downfrom (- total-bits 8) downto 0 by 8
       for cursor upfrom start-index
       do (setf value (dpb (elt buffer cursor)
			   (byte 8 i) value))

	 (format t "buffer[~d]==#x~2X; shift<< ~d bits; value=~d~%"
		 cursor (elt buffer cursor) i value))
    value))

(defun number-to-network-bytes (number total-bits &optional buffer (start-index 0))
  "Convert number to network byte ordered sequence of unsigned bytes."
  (unless (= (mod total-bits 8) 0)
    (error "Please specify total-bits as total for multiples of eight bit bytes"))
  ;; When no buffer is supplied, allocate one that also covers START-INDEX.
  (unless buffer
    (setf buffer (make-array (+ start-index (/ total-bits 8))
                             :element-type '(unsigned-byte 8)
                             :initial-element 0)))
  (loop for i downfrom (- total-bits 8) downto 0 by 8
     for cursor upfrom start-index
     do (setf (elt buffer cursor) (ldb (byte 8 i) number))

       (let ((value (ldb (byte 8 i) number)))
	 (format t "number=~d: shift>> ~d bits; value=~d #x~2X; buffer[~d]==#x~2X~%"
		 number i value value cursor (elt buffer cursor))))
  buffer)
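To show how byte offsets and network byte order combine, here is a self-contained sketch that decodes a whole struct. Everything specific in it is an assumption for illustration: the C layout shown in the comment, the names read-unsigned-be and parse-packet, and the sample bytes. It also assumes the C side wrote multi-byte fields in network (big-endian) order and that the struct has no padding between fields.

```lisp
;; Hypothetical C layout being decoded (8 bytes, no padding):
;;   struct packet {
;;     uint8_t  magic;      /* offset 0 */
;;     uint8_t  state;      /* offset 1 */
;;     uint16_t length;     /* offset 2, big-endian */
;;     uint32_t timestamp;  /* offset 4, big-endian */
;;   };

(defun read-unsigned-be (buffer offset size)
  "Read SIZE bytes at OFFSET from BUFFER as a big-endian unsigned integer."
  (let ((value 0))
    (dotimes (i size value)
      ;; Shift what we have left by one byte and append the next byte.
      (setf value (logior (ash value 8)
                          (elt buffer (+ offset i)))))))

(defun parse-packet (buffer)
  "Return the fields of the hypothetical packet struct as a property list."
  (list :magic     (elt buffer 0)
        :state     (elt buffer 1)
        :length    (read-unsigned-be buffer 2 2)
        :timestamp (read-unsigned-be buffer 4 4)))

(parse-packet #(#xCA 1 #x01 #x02 0 0 #x30 #x39))
;; => (:MAGIC 202 :STATE 1 :LENGTH 258 :TIMESTAMP 12345)
```

If the producer writes structs in host byte order on a little-endian machine, or the compiler inserts padding, the offsets and the byte order here would need to change accordingly.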

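Time & Epoch

When converting time values from other languages (let alone other operating systems), be aware that the epoch, meaning the instant represented by the value 0, may differ.

In ANSI Common Lisp, universal time 0 is midnight, January 1, 1900 UTC, whereas Unix and many C libraries count from January 1, 1970. Simple arithmetic converts between the two.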
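The arithmetic amounts to adding or subtracting the number of seconds between the two epochs. A minimal sketch follows; the constant and function names are hypothetical, and it assumes the C-side value is a count of whole seconds.

```lisp
;; Seconds from 1900-01-01 00:00:00 UTC to 1970-01-01 00:00:00 UTC.
;; The final argument 0 is the time zone, i.e. UTC.
(defconstant +unix-epoch-as-universal-time+
  (encode-universal-time 0 0 0 1 1 1970 0))   ; 2208988800

(defun unix-to-universal-time (unix-seconds)
  "Convert seconds since the Unix epoch to Common Lisp universal time."
  (+ unix-seconds +unix-epoch-as-universal-time+))

(defun universal-to-unix-time (universal-time)
  "Convert Common Lisp universal time to seconds since the Unix epoch."
  (- universal-time +unix-epoch-as-universal-time+))

(unix-to-universal-time 0)            ; => 2208988800
(universal-to-unix-time 2208988800)   ; => 0
```

A timestamp field extracted from a struct can be passed through unix-to-universal-time and then on to decode-universal-time.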