Shell script to generate a word list with non-repeating characters

I found a Lisp program. It works, but not quite the way I need it to. Its output looks like this:

2323232323235ve3
2323232323235ve4
2323232323235ve5
2323232323235ve6

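I need to generate a word list in which every word is 16 characters long, uses the Base32 characters, and contains no repeating characters. Then I need to append .txt to each word: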

asdfjklwert7csui.txt
jcfklinesftw8se3.txt

Then I need to compute the SHA512 of each word and check it against a known hash.

Is it possible to output only the word that matches the known hash?

Here is the Lisp source code:

#!/usr/bin/clisp

(defparameter *character-set* "234567abcdefghijklmnopqrstuvwxyz")
;(defparameter *character-set* "ABC")     ; < --- this line is for testing

(defparameter *word-length* 16)
;(defparameter *word-length* 4)           ; < --- this line is for testing

(defparameter *character-list*
       (coerce *character-set* 'list))

(defun final-char (in-string)
   (cond
      ((> (length in-string) 0)
         (elt in-string (1- (length in-string))))
      (t
         nil)))

(defun new-char-list (in-string)
   (let ((result))
      (mapcar
         (lambda (candidate)
            (cond
               ((not (eql candidate (final-char in-string)))
                  (push candidate result))))
         *character-list*)
      (nreverse result))
      )

(defun extend-string (in-string desired-length)
   (mapcar
      (lambda (new-char)
         (let ((new-string (concatenate 'string in-string (string new-char))))
            (cond
               ((>  (length new-string) desired-length))
               ((>= (length new-string) desired-length)
                  (format t "~a~%" new-string))
               (t
                  (extend-string new-string desired-length)))))
      (new-char-list in-string)))

(extend-string "" *word-length*)

The Bash script writes its output to a file, which looks like the following. I need the output in lowercase.

K5SMKLK5W85T6GTC
RZJRNV0VO1LVIMEM
RPSW59OPQLUBJKC5

Here is the Bash script:

#!/bin/bash
ascii=()                                                #Characters to choose from
index=0
noNames=16                                              #No of names to generate
nameLength=10                                           #Length to generate (you said 10)
for(( i=65; i<=90; i++ ))                               #Add upper-case letters to 'ascii'
do
        ascii[$index]=$(echo $i | awk '{printf("%c", $1)}')
        index=$(( $index + 1 ))
done

for(( i=48; i<=57; i++ ))                               #Add numbers to 'ascii'
do
        ascii[$index]=$(echo $i | awk '{printf("%c", $1)}')
        index=$(( $index + 1 ))
done

for(( i=0; i<$noNames; i++ ))
do
    name=                                           #We'll store the name in here
    last=-1                                         #We'll store the index of the last
                                                    #   character generated here
                                                    #   (-1 means "none yet")
    for(( j=0; j<$nameLength; j++ ))
    do
        num=$(( $RANDOM % $index ))             # Pick a random character index
        while [[ $num -eq $last ]]              #If it's the same as the last
                                                #  one...
        do
            num=$(( $RANDOM % $index ))     #... pick a new one!
        done
        last=$num                               #Update "last" to current value
        name=${name}${ascii[$num]}              #Add the correct letter to our name
    done
    echo "${name}"                                  #Print name...
done > output                                           #...to our output file
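The lowercase requirement can be met by drawing directly from the lowercase Base32 alphabet instead of building an upper-case table. The following is a minimal sketch, not a drop-in replacement: the function name random_word and the variable charset are hypothetical, and the alphabet is assumed to be the same one the Lisp program uses. Like the script above, it only rejects a character that is identical to the one immediately before it.

```shell
#!/bin/bash
# Assumption: the alphabet is the lowercase Base32 set from the Lisp program.
charset="234567abcdefghijklmnopqrstuvwxyz"

# Hypothetical helper: print one random word of the given length in which
# no character is immediately followed by itself.
random_word() {
    local length=$1 name="" last=-1 num j
    for (( j = 0; j < length; j++ )); do
        num=$(( RANDOM % ${#charset} ))      # Pick a random character index
        while (( num == last )); do          # Same as the previous one?
            num=$(( RANDOM % ${#charset} ))  # Then pick again
        done
        last=$num
        name+=${charset:num:1}               # Append the chosen character
    done
    printf '%s\n' "$name"
}

random_word 16
```

If the existing upper-case output has to be kept, piping it through tr 'A-Z' 'a-z' lowercases it, although the digits 0, 1, 8 and 9 it contains are not part of the Base32 alphabet.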

This is a Common Lisp answer, tested with SBCL. Since you need to compute hashes, I will use an external library named Ironclad. To install it, first install Quicklisp. Then:

(ql:quickload :ironclad)

This part can be customized:

(defparameter *character-set* "234567abcdefghijklmnopqrstuvwxyz")
(defparameter *suffix* ".txt")

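Auxiliary functions

Now we will map over all possible strings that satisfy your constraint (no identical consecutive characters). We will also manipulate those strings as bytes, because Ironclad only computes hashes from byte vectors. There is no need to allocate that many strings; we simply reuse the same buffer over and over: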

(defun make-buffer (size)
  (concatenate '(vector (unsigned-byte 8))
               (make-array size :element-type '(unsigned-byte 8))
               (ironclad:ascii-string-to-byte-array *suffix*)))

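The above allocates the required byte vector, taking into account the suffix, which is converted to bytes. Below, we do the same for the character set, which is also coerced to a list (so that we can use DOLIST):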

(defun make-character-set ()
  (coerce (ironclad:ascii-string-to-byte-array *character-set*)
          'list))

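We also want to be able to convert a hash string to a byte vector, while still accepting a vector directly. The following function ensures that the given value is converted to the desired type: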

(defun ensure-hash (hash-designator)
  (etypecase hash-designator
    (string (ironclad:hex-string-to-byte-array hash-designator))
    (vector (coerce hash-designator '(vector (unsigned-byte 8))))))

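Finding the hash

Now we can search the set of generated words for a given hash. The SIZE parameter indicates how many characters precede the suffix, and HASH-DESIGNATOR is either a string in hexadecimal representation or a byte vector: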

(defun find-hash (size hash-designator)
  (let ((hash (ensure-hash hash-designator))
        (buffer (make-buffer size))
        (character-set (make-character-set)))
    (labels ((level (depth forbidden)
               (cond
                 ((>= depth size)
                  (when (equalp hash (ironclad:digest-sequence
                                      'ironclad:sha512 buffer))
                    (return-from find-hash
                      (values (map 'string #'code-char buffer)
                              buffer))))
                 (t (let ((next (1+ depth)))
                      (dolist (c character-set)
                        (unless (= c forbidden)
                          (setf (aref buffer depth) c)
                          (level next c))))))))
      (level 0 0))))

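In the general case, the local level function sets the character at position depth in buffer to each element of the character set in turn, skipping the forbidden character, which is the last one that was set (or zero initially). When level reaches size, the buffer holds the complete word as a byte vector. In that case, we hash the word and compare it with the desired hash. If they match, we convert the byte array (the character codes) to a string and also return the internal buffer (it is already computed and might be reused).

Example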
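The same recursion can be sketched in shell to see which words it visits. This is only an illustration under stated assumptions: the function name enumerate and its argument order are hypothetical, it builds strings instead of reusing a byte buffer, and it is far slower than the Lisp version, so it is only suitable for small alphabets and sizes.

```shell
#!/bin/bash
# Hypothetical helper mirroring the recursion described above.
# Usage: enumerate <alphabet> <size> [prefix] [forbidden]
enumerate() {
    local alphabet=$1 size=$2 prefix=${3:-} forbidden=${4:-}
    # Base case: the word is complete, so print it.
    if (( ${#prefix} >= size )); then
        printf '%s\n' "$prefix"
        return
    fi
    local i c
    for (( i = 0; i < ${#alphabet}; i++ )); do
        c=${alphabet:i:1}
        # Skip the character that was placed at the previous position.
        [[ $c == "$forbidden" ]] && continue
        enumerate "$alphabet" "$size" "$prefix$c" "$c"
    done
}

enumerate abc 2
```

With the alphabet abc and size 2, this prints ab, ac, ba, bc, ca and cb, one per line: every pair except aa, bb and cc.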

(find-hash 3 "ddd2379f9a1adf4f0afa0befafdb070fb942d4d4e0331a31d43494149307221e5e699da2a08f59144b0ed415dea6f920cf3dab8ca0b740d874564d83b9b6f815")
=> "zyc.txt"
   #(122 121 99 46 116 120 116)
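Since the question asked for a shell solution, the comparison step alone can be sketched in Bash as follows. The helper names hash_word and filter_matches are hypothetical, and the sketch assumes the GNU coreutils sha512sum command is available (on other systems a different tool, such as shasum -a 512, may be needed). It reads candidate words from standard input and prints only those whose SHA512, computed over the word plus the .txt suffix, equals the known hash.

```shell
#!/bin/bash
# Hypothetical helper: print the SHA-512 hex digest of "<word>.txt".
hash_word() {
    # printf '%s' avoids the trailing newline that echo would add,
    # which would otherwise change the digest.
    printf '%s' "$1.txt" | sha512sum | awk '{print $1}'
}

# Hypothetical helper: read candidate words on stdin and print only the
# ones whose digest equals the known hash passed as the first argument.
filter_matches() {
    local known word
    # sha512sum prints lowercase hex, so normalise the known hash too.
    known=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    while IFS= read -r word; do
        if [[ "$(hash_word "$word")" == "$known" ]]; then
            printf '%s.txt\n' "$word"
        fi
    done
}
```

For instance, filter_matches "$known_hash" < candidates.txt would print the matching file names, where known_hash and candidates.txt are placeholders for your own hash and word list. Spawning one sha512sum process per word is much slower than hashing in-process as the Lisp code does.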

However, due to the exponential complexity, the task is impractical for 16 characters:

> (time (find-hash 4 #(0)))
Evaluation took:
  1.679 seconds of real time
  1.676000 seconds of total run time (1.672000 user, 0.004000 system)
  [ Run times consist of 0.028 seconds GC time, and 1.648 seconds non-GC time. ]
  99.82% CPU
  4,019,832,288 processor cycles
  899,920,096 bytes consed

NIL

> (time (find-hash 5 #(0)))
Evaluation took:
  51.768 seconds of real time
  51.796000 seconds of total run time (51.684000 user, 0.112000 system)
  [ Run times consist of 0.952 seconds GC time, and 50.844 seconds non-GC time. ]
  100.05% CPU
  123,956,130,558 processor cycles
  27,897,672,624 bytes consed
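To put these timings in perspective: with 32 characters and no identical consecutive characters, there are 32 choices for the first position and 31 for each later one, so a word of length n has 32 * 31^(n-1) candidates. The sketch below computes that count and a rough running time; the function names are hypothetical, and the rate of about 570,000 hashes per second is an assumption derived from the find-hash 5 run above (29,552,672 candidates in 51.768 seconds).

```shell
#!/bin/bash
# Number of candidate words of length n. awk is used because
# 32 * 31^15 does not fit in 64-bit shell arithmetic.
count_words() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", 32 * 31 ^ (n - 1) }'
}

# Hypothetical estimate: years needed to try every word of length n
# at the given rate (hashes per second).
estimate_years() {
    awk -v n="$1" -v rate="$2" 'BEGIN {
        seconds = 32 * 31 ^ (n - 1) / rate
        printf "%.1e\n", seconds / (365 * 24 * 3600)
    }'
}

count_words 5             # candidates for 5 characters
estimate_years 16 570000  # rough number of years for 16 characters
```

At that rate, an exhaustive search over 16 characters comes out on the order of 10^10 years, which is why enumerating every candidate is not a workable approach here.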