Shell Script to generate word list with non repeating characters

Question

I found a Lisp program. It works, but not quite the way I need it to. Its output looks like this:

2323232323235ve3
2323232323235ve4
2323232323235ve5
2323232323235ve6

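I need to generate a word list in which each word is 16 characters long, uses Base32 characters, and has no repeating characters. Then I need to append .txt to each word.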

asdfjklwert7csui.txt
jcfklinesftw8se3.txt

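Then I need to compute the SHA512 of each word and check it against a known hash.

Is it possible to output only the word that matches the known hash?

Here is the Lisp source code: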

#!/usr/bin/clisp

(defparameter *character-set* "234567abcdefghijklmnopqrstuvwxyz")
;(defparameter *character-set* "ABC")     ; < --- this line is for testing

(defparameter *word-length* 16)
;(defparameter *word-length* 4)           ; < --- this line is for testing

(defparameter *character-list*
       (coerce *character-set* 'list))

(defun final-char (in-string)
   (cond
      ((> (length in-string) 0)
         (elt in-string (1- (length in-string))))
      (t
         nil)))

(defun new-char-list (in-string)
   (let ((result))
      (mapcar
         (lambda (candidate)
            (cond
               ((not (eql candidate (final-char in-string)))
                  (push candidate result))))
         *character-list*)
      (nreverse result))
      )

(defun extend-string (in-string desired-length)
   (mapcar
      (lambda (new-char)
         (let ((new-string (concatenate 'string in-string (string new-char))))
            (cond
               ((>  (length new-string) desired-length))
               ((>= (length new-string) desired-length)
                  (format t "~a~%" new-string))
               (t
                  (extend-string new-string desired-length)))))
      (new-char-list in-string)))

(extend-string "" *word-length*)

The Bash script's output, written to a file, looks like this. I need the output in lowercase.

K5SMKLK5W85T6GTC
RZJRNV0VO1LVIMEM
RPSW59OPQLUBJKC5

Here is the Bash script:

#!/bin/bash
ascii=
index=0
noNames=16                                              #No of names to generate
nameLength=10                                           #Length to generate (you said 10)
for(( i=65; i<=90; i++ ))                               #Add upper-case letters to 'ascii'
do
        ascii[$index]=$(echo $i | awk '{printf("%c",$1)}')
        index=$(( $index + 1 ))
done

for(( i=48; i<=57; i++ )) # Add numbers to 'ascii'
do
        ascii[$index]=$(echo $i | awk '{printf("%c",$1)}')
        index=$(( $index + 1))
done

for(( i=0; i<$noNames; i++))
do
    name=                                           #We'll store the name in here
    last=                                           #We'll store the index of the last 
                                                        #   character generated here
    for(( j=0; j<$nameLength; j++))
    do  
        num=$(( $RANDOM % $index ))             # Pick a random character index
        while [[ $num -eq $last ]]              #If it's the same as the last 
                                                        #  one...
        do
            num=$(( $RANDOM % $index ))     #... pick a new one!
        done
        last=$num                               #Update "last" to current value
            name=${name}${ascii[$num]}              #Add the correct letter to our name
    done
    echo "${name}"                                  #Print name...
done > output                                           #...to our output file
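
For the lowercase Base32 requirement, one possible sketch (not part of the original post) swaps the per-character loop for a tr pipeline. It assumes /dev/urandom and a head that supports -c are available; random_word is a hypothetical helper name. Squeezing with tr -s removes any character that immediately follows an identical one, which gives the same "never the same as the last character" rule as the loop above.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post):
# print one random word of the requested length using lowercase Base32
# characters, with no character immediately followed by itself.
random_word() {
    # first tr : keep only bytes from the alphabet 2-7 and a-z
    # second tr: squeeze runs of the same character down to one
    # head     : stop after the requested number of characters
    LC_ALL=C tr -dc '2-7a-z' < /dev/urandom \
        | LC_ALL=C tr -s '2-7a-z' \
        | head -c "$1"
    echo
}

random_word 16
```

Like the original script, this only produces random samples; it does not enumerate every possible word.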

Answer 1

Here is a Common Lisp answer, tested with SBCL. Since you need to compute hashes, I will use an external library called Ironclad. To install it, first install Quicklisp. Then:

(ql:quickload :ironclad)

This part can be customized:

(defparameter *character-set* "234567abcdefghijklmnopqrstuvwxyz")
(defparameter *suffix* ".txt")

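Auxiliary functions

Now, we are going to map over all possible strings that satisfy your constraints (no identical consecutive characters). We will also manipulate those strings as bytes, because Ironclad computes hashes only from byte vectors. There is no need to allocate that many strings; we just reuse the same buffer over and over: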

(defun make-buffer (size)
  (concatenate '(vector (unsigned-byte 8))
               (make-array size :element-type '(unsigned-byte 8))
               (ironclad:ascii-string-to-byte-array *suffix*)))

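The above allocates the required byte vector, taking the suffix (converted to bytes) into account. Below, we do the same for the character set, which is also coerced to a list (so that DOLIST can be used):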

(defun make-character-set ()
  (coerce (ironclad:ascii-string-to-byte-array *character-set*)
          'list))

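We also want to be able to convert a hash string to a byte vector, while still accepting vectors directly. The following function ensures that the given value is converted to the desired type: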

(defun ensure-hash (hash-designator)
  (etypecase hash-designator
    (string (ironclad:hex-string-to-byte-array hash-designator))
    (vector (coerce hash-designator '(vector (unsigned-byte 8))))))

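Find hash

Now, we can search for the hash among the generated words. The SIZE parameter indicates how many characters come before the suffix, and HASH-DESIGNATOR is either a string in hexadecimal representation or a byte vector: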

(defun find-hash (size hash-designator)
  (let ((hash (ensure-hash hash-designator))
        (buffer (make-buffer size))
        (character-set (make-character-set)))
    (labels ((level (depth forbidden)
               (cond
                 ((>= depth size)
                  (when (equalp hash (ironclad:digest-sequence
                                      'ironclad:sha512 buffer))
                    (return-from find-hash
                      (values (map 'string #'code-char buffer)
                              buffer))))
                 (t (let ((next (1+ depth)))
                      (dolist (c character-set)
                        (unless (= c forbidden)
                          (setf (aref buffer depth) c)
                          (level next c))))))))
      (level 0 0))))

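In the general case, the local LEVEL function sets the character at position DEPTH in BUFFER to each element of the character set in turn, skipping the forbidden character, which is the last one that was set (or zero initially). When DEPTH reaches SIZE, the buffer holds a complete word as a byte vector. In that case, we hash the word and compare it with the desired hash. On a match, we convert the byte array (character codes) to a string and also return the internal buffer as a second value (it is already computed and might be reused).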
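
For readers more comfortable with shell, the same depth-first idea can be sketched as a small recursive function. This is only an illustration under stated assumptions: enumerate and chars are hypothetical names, the three-letter alphabet is for demonstration, and it builds a new string per step instead of reusing a byte buffer as the Lisp code does.

```shell
#!/bin/bash
# Demonstration alphabet (an assumption); the real task would list the
# 32 Base32 characters here, separated by spaces.
chars="a b c"

# Hypothetical helper: print every word of length $2 that extends the
# prefix $1 without ever repeating the previous character.
enumerate() {
    # A complete word: print it and stop descending.
    if [ "${#1}" -ge "$2" ]; then
        printf '%s\n' "$1"
        return
    fi
    for c in $chars; do
        case "$1" in
            *"$c") continue ;;   # forbidden: same as the last character
        esac
        enumerate "$1$c" "$2"
    done
}

enumerate "" 2   # prints ab ac ba bc ca cb, one per line
```

Each call keeps its own prefix in its positional parameters, so the recursion needs no local variables.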

Example

(find-hash 3 "ddd2379f9a1adf4f0afa0befafdb070fb942d4d4e0331a31d43494149307221e5e699da2a08f59144b0ed415dea6f920cf3dab8ca0b740d874564d83b9b6f815")
=> "zyc.txt"
   #(122 121 99 46 116 120 116)
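
A single candidate can also be checked from the shell. The following is a minimal sketch, assuming sha512sum (GNU coreutils) or shasum is installed; sha512_hex and find_match are hypothetical helper names, and the target digest must be given in lowercase hex because that is what these tools print.

```shell
#!/bin/bash
# Hypothetical helper: print the SHA-512 hex digest of the string $1
# (hashed without a trailing newline).
sha512_hex() {
    if command -v sha512sum >/dev/null 2>&1; then
        printf '%s' "$1" | sha512sum | cut -d ' ' -f 1
    else
        printf '%s' "$1" | shasum -a 512 | cut -d ' ' -f 1
    fi
}

# Hypothetical helper: read candidate words on stdin and print the first
# one whose digest, with suffix $2 appended, equals the target $1.
find_match() {
    while IFS= read -r word; do
        if [ "$(sha512_hex "$word$2")" = "$1" ]; then
            printf '%s\n' "$word$2"
            return 0
        fi
    done
    return 1
}

# Demonstration: compute a target locally, then search three candidates.
target=$(sha512_hex "zyc.txt")
printf '%s\n' zya zyb zyc | find_match "$target" ".txt"   # prints zyc.txt
```

Spawning a process per candidate is far slower than hashing in-process, so this is useful for spot checks rather than for a full search.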

However, because of the exponential complexity, the task is impractical for 16 characters:

> (time (find-hash 4 #(0)))
Evaluation took:
  1.679 seconds of real time
  1.676000 seconds of total run time (1.672000 user, 0.004000 system)
  [ Run times consist of 0.028 seconds GC time, and 1.648 seconds non-GC time. ]
  99.82% CPU
  4,019,832,288 processor cycles
  899,920,096 bytes consed

NIL

> (time (find-hash 5 #(0)))
Evaluation took:
  51.768 seconds of real time
  51.796000 seconds of total run time (51.684000 user, 0.112000 system)
  [ Run times consist of 0.952 seconds GC time, and 50.844 seconds non-GC time. ]
  100.05% CPU
  123,956,130,558 processor cycles
  27,897,672,624 bytes consed
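
To put the timings in perspective, the size of the search space can be estimated. With a k-character alphabet and no two identical adjacent characters, there are k choices for the first position and k-1 for each later one, so k * (k-1)^(n-1) words of length n. The sketch below uses awk for the floating-point arithmetic; candidate_count is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: number of words of length $1 over an alphabet of
# $2 characters with no two identical adjacent characters,
# i.e. k * (k-1)^(n-1), printed in scientific notation.
candidate_count() {
    awk -v n="$1" -v k="$2" 'BEGIN { printf "%.3e\n", k * (k - 1) ^ (n - 1) }'
}

candidate_count 5 32    # the size-5 run timed above
candidate_count 16 32   # the size the question asks for
```

The size-5 run covers about 3 × 10^7 candidates in roughly 52 seconds, so a little under 6 × 10^5 hashes per second. At that rate, the roughly 7.5 × 10^23 candidates for size 16 would take on the order of tens of billions of years on the same machine.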
